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ABSTRACT 


This is Volume I of two volumes of the Proceedings of the 3rd Annual 
Conference on Aerospace Computational Control. 

The term Computational Control was coined this year to encompass that 
range of computer-based tools and capabilities needed by aerospace control 
systems engineers for design, analysis, and testing of current and future 
missions. This year's conference furthered the dialogue in this area begun 
at the 1987 Workshop on Multibody Simulation in Pasadena and continued 
at the 1988 Workshop on Computational Aspects in the Control of Flexible 
Systems in Williamsburg, Virginia. 

A group of over 200 engineers and computer scientists representing 
government, industry, and universities convened at Oxnard for a three-day 
intensive conference on computational control. The conference consisted of 
thirteen sessions with a total of 107 technical papers in addition to opening 
and closing panel discussions. 

Conference topics included definition of tool requirements, advanced 
multibody simulation formulations, articulated multibody component 
representation descriptions, model reduction, parallel computation, real time 
simulation, control design and analysis software, user interface issues, 
testing and verification, and applications to spacecraft, robotics, and aircraft. 
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EXECUTIVE SUMMARY 


The term Computational Control was coined this year to encompass that range of computer-based 
tools and capabilities needed by aerospace control systems engineers for design, analysis, and testing 
of current and future missions. This year's conference advanced the dialogue in this area begun at 
the 1987 Workshop on Multibody Simulation in Pasadena and continued at the 1988 Workshop on 
Computational Aspects in the Control of Flexible Systems in Williamsburg, Virginia. 

A group of over 200 engineers and computer scientists representing government, industry, and 
universities convened at Oxnard for a three-day intensive conference on computational control. The 
conference consisted of thirteen sessions with a total of 107 technical papers. During the conference, 
eleven prominent computer hardware and control system design software companies displayed their 
products via hands-on demonstrations. 

It was confirmed at the conference that the current control design and simulation tools are a 
limiting factor in today's control design and testing and are inadequate for future needs. The areas of 
concern are: A) Control design tools break down for high order systems. B) Spacecraft simulation 
tools are too slow to be used effectively for design and testing. C) An integrated computer-aided 
modeling-design-simulation environment is needed to improve productivity. 
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INTRODUCTION 


The objective of the Computational Control Program is to spur the development of a new 
generation of articulated multibody spacecraft and robot modeling, control design, and simulation 
software tools. In order to assure a research approach that meets the needs of the community, it is 
necessary to bring the researcher, hardware and software developer, and user communities together 
on a periodic basis to allow exchange of ideas on requirements and developing capability. This 
conference provides a forum for that exchange. The term Computational Control was coined this 
year and encompasses the effort begun at the 1987 Workshop on Multibody Simulation in Pasadena 
and continued at the 1988 Workshop on Computational Aspects in the Control of Flexible Systems 
in Williamsburg, Virginia. These workshops are now considered the first and second annual 
conferences (workshops) on Computational Control, respectively. 

The objectives of the 3rd annual Conference on Aerospace Computational Control were: 1) To 
provide NASA, NSF, and DOD a window into current computational control research and 
development in the areas of multibody dynamics formulation, concurrent processing, computational 
techniques, and control analysis and simulation software. 2) To allow researchers and tool 
developers to exchange ideas and experience on the correctness, efficiency, and usability of current 
computational tools. 3) To identify needs in important research areas such as computer-aided control 
design and multibody simulation tools which can be dealt with by government, industry, and 
university combined efforts. 4) To strengthen cooperation between industry and university 
researchers and aid in the transfer of technology from the research level to the users. 

Each of these objectives was successfully met. Among the more than 200 conference 
participants in Oxnard were representatives from government, aerospace and computer hardware and 
software industries, and universities. The conference was led off by an opening session centered on 
a panel discussion of requirements for computational control tools and the state-of-the-art of such 
tools. The bulk of the conference consisted of thirteen sessions with a total of 107 technical papers. 
The sessions ranged from multibody dynamics formulation to parallel processing methods to control 
design and analysis tools to user environments. During the conference, eleven prominent computer 
hardware and control system design software companies displayed their products via hands-on 
demonstrations. The conference concluded with a lively panel discussion focused on future 
directions with active participation by users, researchers, hardware and software developers, and 
sponsors. 

It was confirmed at the conference that the current control design and simulation tools are a 
limiting factor in today's control design and testing and are inadequate for future needs. The areas of 
concern are: A) Control design tools break down for high order systems. B) Spacecraft simulation 
tools are too slow to be used effectively for design and testing. C) An integrated computer-aided 
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modeling-design-simulation environment is needed to improve productivity. 

This report is organized into three sections: the first describes the opening session with the panel 
discussion on requirements and state-of-the-art, the second section discusses highlights of the 
technical sessions, and the third section captures the closing session with its panel discussion on 
future directions. 


OPENING SESSION 


Sponsor Remarks 

The conference was opened with remarks from each of the sponsors. NASA was represented by 
John DiBattista, the manager of the NASA Guidance, Navigation, and Control Program, who 
indicated that he planned to make the Computational Control area one of his highest priorities for 
funding. Elbert Marsh, NSF program director for Dynamics, Systems, and Control, stressed the 
interest of NSF in supporting basic and fundamental research in science and engineering. This 
includes computational control to the extent that it addresses generic issues within dynamic systems 
and control. Terry Hinnerichs, Chief of the Control and Development Branch at AFWL, described 
future DOD space missions which will need the sort of expanded capability at which computational 
control is aiming. 

Introduction to Computational Control 

The conference chair, Guy K. Man, gave an introduction to Computational Control. He 
explained that this conference is a sequel to the effort begun at the 1987 Workshop on Multibody 
Simulation and continued at the 1988 Workshop on Computational Aspects in the Control of Flexible 
Systems. He described the recommendations from the first workshop that led to a needs assessment 
effort and the formation of a multibody simulation technology verification committee. The needs 
assessment showed that in selected areas of multibody simulation and computer aided control design, 
orders of magnitude of improvement are necessary. The progress was interrupted at this point due to 
cutoff of SDIO funding. Out of the verification effort and the needs assessment findings, the case 
was made to NASA that a new program was needed. Computational Control is that new program. 

The objective of Computational Control is to develop a new generation of articulated multibody 
spacecraft and robot modeling, control design, and simulation prototype software tools. The 
motivation behind this objective is that current tools are a limiting factor in today’s control design and 
verification, and are inadequate for future needs. 

Panel Discussion: Requirements and State-of-the-Art 

The panel discussion was organized to give computational control tool researchers, developers, 
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and users a forum to discuss their perceptions of the state of the art and future requirements. The 
panel comprised: Panel Chair Larry Taylor of NASA Langley, John Sunkel of NASA Johnson Space 
Center, Glenn Macala of Jet Propulsion Laboratory, John Breakwell of Lockheed Missiles and 
Space, Terry Hinnerichs of the Air Force Weapons Laboratory, Guy K. Man of Jet Propulsion 
Laboratory, Achille Messac of C. S. Draper Laboratories, Jim Turner of Photon Research Asso- 
ciates, and Alan Laub of University of California Santa Barbara. Most panel members gave opening 
remarks of 5—15 minutes; Alan Laub gave a full-length paper as requested. 

A brief sketch of the panelists' opening remarks follows. Larry Taylor introduced the panel 
noting that if we continue business as usual, the computational burden will be overwhelming. 

John Sunkel described NASA Space Station requirements. He described the space station as a 
multibody system. The approach to flexible dynamics on Space Station is to separate the flexible 
modes — starting at 0. 1 Hz — from the control Bandwidth of 0.01 Hz. A high fidelity flexible body 
simulator is needed for verification. 

A test-bed environment is being developed for Space Station support. It will include orbital 
dynamics, aerodynamics, gravity gradient effects, sensor and actuator models, and flexible 
multibody dynamics. An early flexible body simulation dramatizes the need to improve the speed of 
current simulations: a 141 mode model required 7-8 hours to simulate 300 seconds. 

Glenn Macala discussed the NASA effort to define computational control tool user requirements. 
User requirements are intended to illuminate the path towards efficient realization of computational 
control tools. The approach has been to first call out the basic functions of the controls engineer in 
the development of a spacecraft control system. Next, the functions of the tools needed to support 
the control engineer in his task were defined. In addition, NASA mission models were identified that 
would be good candidates to use the next generation of computational control tools. The final step is 
to compare the needed functions with the mission models to set performance requirements on the tool 
functions. 

The report "Computational Control Program User Requirements," JPL Publication 89-40, is in 
preparation. 

John Breakwell gave a view from industry. His perspective is that control engineers typically 
stick with what worked the last time. The change that he sees is that the availability of real-time 
control analysis tools is giving the analyst more ability to go into the laboratory and contribute to the 
test and verification activity. He feels that tools which reduce the gap between designers and 
implementers would be of great value. 

Terry Hinnerichs discussed Air Force space requirements. Focusing on SDI, he described a 
wide variety of space systems concepts. One example of where multibody dynamics and 
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computational control fit in was the problem of a fine pointing telescope attached to a noisy laser 
device. 

Guy K. Man outlined the results of the multibody simulation codes user survey. A total of 243 
questionnaires were sent to 78 organizations. There were 40 responses, covering 27 organizations 
and 21 codes. Data collected included documentation, welcome features, additional features desired, 
ease of use, typical applications, run time acceptability, and over 20 additional issues. The details 
will be included in a paper in the conference proceedings. 

Achille Messac discussed the Control Structure Interaction (CSI) problem-redefining it as one of 
Control Structure Integrated Design. His message was that the combination of the optimal structure 
and the optimal control does not necessarily yield the optimal system. 

Jim Turner discussed current and possible future directions for multibody dynamics software. 
He listed three technical breakthroughs which are stimulating current research: Order N algorithms, 
symbolic manipulation, and parallel processing. He also identified areas with little current capability 
which remain areas for potential growth. These included fluid/mechanical coupling, large deflection 
flexible dynamics, and event driven activities such as robot/payload interaction. 

Alan Laub gave a survey paper covering control algorithms and software. He started by agreeing 
that much of today's software either is not extensible or is inadequate to address future requirements. 
He advanced the folk theorem that methods that are good for hand calculation are typically poor for 
digital computation. 

Accuracy is seen to be dependent on: the conditioning of the problem, the numerical stability of 
the algorithm, and the specifics of the hardware and software implementation. In the area of 
modeling, problems were identified for transformations between transfer function and state space 
realizations for large order systems. Since much problem structure is lost when second order 
systems are recast as first order systems, the value of developing control and synthesis methods for 
state space models in second order form was stressed. 

Laub described the status of current control algorithms as follows: modest size (<200 states), 
dense matrices, no exploitable structure, serial computing. Suggestions for future developments are: 
use reliable components (for example LAPACK as a starting point,) create an extendible package 
rather than including everything anyone could want, and plan a team approach resulting in a 
coordinated, broad based, focused, long-term, carefully planned effort. 

Open Discussion 

Following the panel discussion, Larry Taylor opened the floor to open discussion. 

During the open discussion, Ray Gluck raised the issue of funding for computational control 
research. John DiBattista responded that there is a definite opportunity for increased funding— but 
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only if the community can clearly characterize the field as one ripe for solutions to problems of 
importance to NASA. 

John Hedgepeth addressed the issue of assuring that the results of computational control research 
get into real applications. Comparing computational control to CSI he suggested that concrete 
examples are needed to make clear where this art can help in the design of advanced performance 
spacecraft in ways that are really needed by users of the system. 

Ed Haug stressed the importance of Differential Algebraic Equations (DAE) in the simulation of 
constrained systems. Alan Laub agreed, adding that control of DAEs is much harder than control of 
ODEs. 


TECHNICAL SESSIONS 

Due to the size of the conference compared to previous computational control workshops, it was 
necessary to break the technical sessions up into parallel sessions. The individual technical papers 
are described in the conference proceedings. 

Any list of highlights would certainly omit important papers, but a few items will be mentioned 
to capture some of the flavor of the technical sessions. In the multibody formulation area, von 
Flotow addressed the issue of the need for non-linear strain-displacement relations in dynamic 
analysis-describing three options as consistent, inconsistent, and ruthless. Rodriguez outlined a 
powerful new dynamic analysis approach-'spatial operator algebra." In the parallel processing area, 
many authors discussed methods and advantages of fine and coarse grain parallelization. Slafer 
addressed the state-of-the-art in real-time hardware-in-the-loop simulation, and Howe put forward 
new numerical integration techniques which help increase simulation speed. 

A number of papers discussed dynamic analysis applications ranging from the space station to 
manipulators to fighter aircraft. Others discussed emerging integrated capabilities: Russell discussed 
ISM at Boeing, Bauer and Downing discussed INCA and ASTEC at NASA Goddard, Kumar discussed 
TREETOPS work at Dynacs, and Canfield and Hummel discussed ASTROS and MEAD at Wright- 
Patterson. 


ADDITIONAL CONFERENCE ACTIVITIES 

A banquet was held the second day of the conference with Ed Haug of the University of Iowa as 
the featured speaker. Haug took a global perspective, looking at the competitive position of the 
United States in the world. He concluded that one place that we could leverage our capabilities was 
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in expanded use of simulation technology to improve the performance and reliability of our products. 

Eleven software and hardware vendors displayed their products during the technical sessions of 
the conference. These included Applied Dynamics International, AT&T/Circuit Studios, BBN 
Laboratories, Boeing Computer Services, Cray Research Inc., Integrated Systems, Mitchell, 
Gauthier & Associates, SIKOG Inc (LEVCO), Silicon Graphics, Systolic Systems, and 3-D Visions. 


CLOSING SESSION 

The closing session was designed to bring everyone back together after two days of parallel 
sessions for two general interest papers followed by a panel discussion on future directions. 

Full-Length Papers 

The first speaker was Ed Haug of the University of Iowa, who described the concept of a 
Verification Library for multibody simulation software. The idea of a verification library is to build 
up a public domain library to store data for example problems. The intent is to facilitate the 
comparison of test results and the results from different multibody simulation codes. The library 
would contain model information, simulation results, test definition data, and processed test data. 

Three flexible multibody codes are being compared at this time: DADS, DISCOS, and CONTOPS. 
Fundamental differences between these codes include: DADS and DISCOS use absolute coordinates 
where CONTOPS uses relative coordinates, DISCOS uses Euler angles to define joints while CONTOPS 
and DADS use unit vectors. In DISCOS and CONTOPS, the user must specify a kinematically 
acceptable initial condition, while DADS will solve an optimization problem to provide the initial 
condition based on estimates of the states. These differences make the job of designing a translator 
between input data sets for these codes difficult. The complete paper discussing the above is in the 
conference Proceedings 

The next speaker was John Doyle of the California Institute of Technology. Doyle gave his view 
of the history of control theory and its future directions. His main message was that for the "post- 
modern" control era we should emphasize control as a strategy for dealing with uncertainty. 

Panel Discussion 

Guy K. Man moderated the panel discussion with the theme "future directions." The panel was 
composed of government sponsors, users, software/hardware developers, and academicians. John 
DiBattista of NASA Headquarters and Terry Hinnerichs of the Air Force Weapons Laboratory 
represented government sponsors. Government and industry users of Computational Control tools 
included: John Sunkel of NASA Johnson Space Center, Loren Slafer of Hughes Aircraft, and John 
Breakwell of Lockheed Missiles and Space. Hardware and software developers were represented by 
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Jim Taylor of General Electric, Doug Petesch of Cray Research, and Dan Rosenthal of Symbolic 
Dynamics. Academic participants included George Lee of Purdue University, Alan Laub of 
University of California Santa Barbara, and John Doyle of the California Institute of Technology. 

Guy K. Man started off the discussion by relating firsthand experience with existing tools 
leading to the conclusion that they are too slow for use for spaceflight projects. He mentioned the 
mission models currently being considered as computational control reference problems. Other 
possibilities mentioned by panelists and attendees included Lunar/Mars initiative spacecraft, space 
transportation systems, solar-electric propulsion, solar sail, national aerospace plane, SDI directed 
energy spacecraft, and tether systems. 

George Lee mentioned a related program in flight controls sponsored by the Wright Patterson 
Research Center. He had two key messages based on their experiences: use the best quality low 
level numerics package you can find, and when considering user friendliness, think of the expert 
user as well as the novice. 

Alan Laub stressed that current state-of-the-art numerical analysis software such as EISPACK and 
LINPACK were only possible due to stable long-term funding, and that year-to-year funding tends to 
lead to the taking of shortcuts. 

The issue of the interaction between the control engineer and the computer scientist in 
computational control was raised. Amir Fijany raised the issue of the importance for the controls 
and dynamics specialist to know the power of parallel processing. Similarly, Dan Rosenthal related 
from his experience the value of bringing in the talents of computer scientists at an early point. The 
flow of tool requirements from control engineers to computer scientists was discussed, and 
additional comments stressed the feedback from the computer scientists back to the control 
engineers. 

John Doyle sees a set of core control routines — comparable to computing eigenvalues of a 
matrix — that are essential to being able to solve the types of problems that we talk about, and need a 
lot of work for actual implementation. He also finds that MATLAB-type packages, while good for 
linear algebra, have completely inadequate data types for control. 

The issue of the lack of standardization in interfaces between packages was raised. This is an 
issue that many groups are concerned about. Jim Taylor mentioned that the International Federation 
of Automatic Control (IFAC) has a working group studying standardization of data structures. Terry 
Hinnerichs referred to a Russell's paper at this conference describing a AFWL sponsored effort tying 
together structural dynamics, controls, thermal and optics codes. This development is the Integrated 
Structural Modeling (ISM) program, which is to go into beta testing in 1990. 
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Doug Petesch suggested that users explore supercomputing capabilities before presuming that the 
solution required parallel processing. He also mentioned the advantage that comes from doing 
software development as well as production runs on the supercomputer. The value of a hypothetical 
teraflop machine was discussed. It was agreed that while valuable, it would not solve all the 
problems — such as numerical difficulties with higher order systems. In addition some thought that 
the same speed could be achieved at a much lower cost using parallel processing. 


Dick Vandervoort of DYNAC referred back to John Doyle's comments about uncertainty, noting 
that multibody dynamics modeling gives highly detailed models— more detailed than the control 
designer needs or wants. In fact most of it is uncertainty, and what we need is a way to model that 
uncertainty in a way that can be used in the control design 

Perhaps John Hedgepeth s comment best summed up the reason to move forward with 
Computational Control. Summarizing the comments of several speakers, he noted that we are 
proceeding ahead conservatively in control system design for planned missions, and that without 
improvements in technology — we are choosing to design only spacecraft that we already know how 
to design. 
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Introduction 

A survey of multibody simulation codes was conducted in the spring of 1988, to obtain an 
assessment of the state of the art in multibody simulation codes from the users of the codes. 
This information will be used to evaluate the need to develop future codes. If this need is 
established, it will also be used to guide the development of future capabilities. A questionnaire, 
covering 30 issues, was developed for this purpose. Sixty questionnaires were sent out to 50 
organizations. We received 40 replies from 30 organizations covering 21 simulation codes. 

This survey covers the most often used articulated multibody simulation codes in the 
spacecraft and robotics community. There was no attempt to perform a complete survey of all 
available multibody codes in all disciplines. Furthermore, this is not an exhaustive evaluation 
of even robotics and spacecraft multibody simulation codes, as the survey was designed to 
capture feedback on issues most important to the users of simulation codes. We must keep in 
mind that the information received was limited and the technical background of the respondent 
varied greatly. Therefore, this paper only reports the most often cited observations from the 
questionnaire. In this survey, it was found that no one code had both many users (reports) 
and no limitations. 

The paper is organized as follows. The next section is a report on multibody code applica- 
tions. Following applications is a discussion of execution time, which is the most troublesome 
issue for flexible multibody codes. The representation of component flexible bodies, which 
affects both simulation setup time as well as execution time, is presented next. Following com- 
ponent data preparation, two sections address the accessibility or usability of a code, evaluated 
by considering its user interface design and examining the overall simulation integrated envi- 
ronment. A summary of user efforts at code verification is reported, before a tabular summary 
of the questionnaire responses. Finally, some conclusions are drawn. 

Applications 

It is not surprising that general purpose multibody simulation codes are used to address 
spacecraft and robotics with articulated elements, because of the complexity of the equations 
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of motion. Some codes are equipped to simulate ground vehicles or aircraft. The actual 
applications include control design and verification, deployment simulation, machine design, 
impact simulation, tether simulation, explosion simulation, and human in-the-loop real time 
simulators for the Space Shuttle and Space Station. About 70% of the users reported that 
multibody codes are used for control system design and verification. The remaining 30% apply 
multibody codes for system design and dynamics studies. It is also reported that multibody 
codes are used to check other codes. For example, a rigid body code might be used to check a 
more complex flexible multibody code. 

System order exercised varies from a robot arm with less than 10 states to a Space Station 
model with 150 states. One user reported trying a model with over two hundred states with 
modes up to 1000 Hz using DISCOS. Flexible body systems usually are higher order systems 
than rigid body systems, and the simulation duration is usually shorter. Simulation duration 
does however vary according to the application. 

Execution Time 

Excessive execution time is one of the two most cited shortcomings of current multibody 
codes. This concern applies especially to flexible multibody simulations. Existing codes are so 
slow for most problems that it is impossible to use them during the design phase when quick 
turnaround is needed. Highly simplified flexible models or even rigid body models must suffice, 
even though the control system designer fully recognizes that the result may be inaccurate. 
For example, consider a 20 degree of freedom flexible system with flexible modes less than 100 
Hz. The CPU time to real time ratio for this example is 200/1 on a VAX class computer. 
As the system order becomes higher, the computational load will become much more stressing 
because the computational load scales as N 3 , for the current codes, where N is the number of 
degrees of freedom. 

It is interesting to note from the survey that the simulation durations for flexible body 
problems are usually short (seconds). Very few users are using flexible codes for long runs. 
From the data, simulation duration is seen to be inversely proportional to the system order. 
This telltale observation indicates that flexible multibody simulation codes are not being used 
extensively, because long duration simulation costs are too high. One of the respondents 
summarizes this quite well - “Multibody run times are seldom acceptable. You either pay the 
price or get approximate answers.” 

Getting an approximate answer is the only way to reduce excessive execution time for given 
hardware and software. Approximation of flexible multibody systems is done most commonly 
by reducing the order of the individual flexible body dynamics, in fact often reducing the bodies 
down to rigid bodies. More than one user reported that fast rigid body simulations were used 
to check the more complex flexible body simulations. If appropriate, the more complex flexible 
body simulation is abandoned in favor of the faster rigid body code. The model reduction 
approach has not been included in this survey. But some observations of this method are 
discussed in the next section. 
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Component Model Representation 

One of the major concerns in modeling a flexible multibody system is how to obtain data 
for the component elastic bodies. Among the flexible multibody codes surveyed, all except 
one code use the modal representation for flexible bodies. The primary benefit of using a 
modal representation is “simplicity,” which can lead to reduced computation time. There are 
a variety of modes one can use to represent the individual bodies of a multibody system, such 
as Hurty modes (Craig Bampton modes) or cantilever modes. The modes of each body have 
to be supplied to the multibody code which will then synthesize the modes of the component 
bodies to arrive at a system model for time simulation. 

The body of knowledge of choosing the best component mode set for system synthesis 
is called component mode synthesis. This includes both choosing the type of modes to use 
(possibly several) as well picking out which of the modes must be retained for simulation 
fidelity. The various component mode sets were developed for this purpose, however they were 
developed for systems with no articulating components, essentially static system geometry. 
Flexible multibody codes, on the other hand, were developed to model dynamical systems with 
varying geometry. The applicability of component mode synthesis results to multibody analysis 
is not well understood and care must be taken in each specific case to ensure that the results 
are correct. One question that needs to be asked often is whether or not a set of component 
body modes is complete over the range of interest of the articulation angles (or translations). 
After the system is synthesized in the multibody code, a check can be made against a higher 
fidelity linear model such as a NASTRAN system model only for a fixed geometry. 

It was pointed out in the previous section that model reduction is one of the commonly used 
methods to obtain an approximate solution. This is a necessary step for obtaining a reasonable 
simulation computational time. A variety of methods exist to pick out the significant modes. 
These arose out of the the slightly different model reduction problems of control theory. What 
model reduction criterion should one use to obtain an adequate approximate multibody model? 
No definitive answer exists, but the answer certainly depends on how the simulation is to be 
used. For example, if the simulation is to be used for control system design, then controllability 
and observability criteria may play a part. It may also be useful to consider the control design 
problem simultaneously with the model reduction problem. In some multibody codes the level 
of modal approximation used (for example the DISCOS mass distribution options) ties directly 
into the flexibility data preparation process, compounding the data preparation problem. 

Considering these issues, it is not surprising that the survey indicates flexible data require- 
ments are obscure, challenging, poorly treated and inadequate. Considerable research remains 
to be done in component model representation. 

User Interface 

One of the most common complaints concerning current multibody simulation codes is 
the lack of interactive model setup capability. Codes that have a friendly input format win 
praise from the users. Interactive, menu driven type preprocessors with good error messages are 
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indicated as highly desirable. Incomplete documentation of the code was the second most cited 
complaint. Sophisticated users also desire well commented source code, especially if the user 
manual is inadequate. Online manuals and documentation were mentioned by several users as 
highly desirable. For flexible multibody codes, the availability of the theoretical background on 
which the code is based is very important. In addition to documentation, application examp es 
are found to be an effective way to train users. It was also indicated that examples are needed 

for modal data preparation. 

At the output interface, data retrieval flexibility and graphics seemed to be most impor- 
tant. Users were divided on whether they wanted built-in plotting capabilities or whether they 
simply wanted the ability to export data to their own favorite plotting programs. The survey 
did not indicate any major concerns in this area. 

Integrated Environment 

Multibody codes are sometimes used as stand alone tools but more recently complex 
flexible multibody codes are often used in conjunction with other software such as finite element, 
control analysis, or optics codes. In the latter environment, the multibody code becomes 
a fundamental building block of an integrated design and analysis environment. Some of 
the codes surveyed, such as TREETOPS, are actually a microcosm of such an environment. 
This is a natural development trend, because multibody systems are becoming more complex, 
forcing system engineering work to become multidisciplinary. The survey pointed out that 
no environment can satisfy everybody. Every organization has its own culture and emphasis. 
Everybody has some pet code. It is therefore perhaps wise to treat multibody codes as modular 
building blocks which can be easily integrated into a bigger environment. If this is the approach 
we adopt, then there is a need to define a standard interface for data passage. 

There are a number of additional capabilities that users found helpful beyond the gen- 
eration and integration of equations of motion. These are: a library of joints sensors and 
actuators, the ability to obtain constraint forces, transfer functions, Jacobians, and key state 
matrices for external control analysis, the ability to incorporate user defined subroutines, and 
of course a model reduction preprocessor. 

Verification 

A number of codes have been independently checked by users with other codes. Most 
users utilize simple test cases and the principles of mechanics to check simulation results. Of 
the test cases reported, none are designed to check flexible body effects. This is not surprising, 
because if component model data generation is not well understood, the verification of the 
model must also be nontrivial. More work should be done in this area so as to add confidence 
in the simulation results and the flexible multibody codes. 

It is suggested that a set of standard test cases be collected, which can be used by the 
community to check both existing and new simulations codes. The set should include simple 
test cases with known simulation results as well as actual experiments with test data. This 
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collection process has been started by NASA and the University of Iowa. 

Tabulated Data 

The user questionnaire data has been summarized in the following tables. Table 1 gives the 
availability of the multibody code, as well as how many respondents used each code and how 
extensive the information provided by the users was. Almost all of the software is available, 
either commercially or in the public domain (COSMIC). 

Table 2 gives various details of how users used the codes, and what code features they 
liked or felt needed improvement. Typical applications and whether or not the run times were 
acceptably fast are given, as well as what component data the code required and, in general, 
how easy the code was to use. User comments are included on the quality of the documentation 
as well as what features of the code they liked, and what features could be added to improve 
productivity. Following are some comments on the individual columns. 

There was a wide range in documentation quality, from “clear and precise” to “inadequate 
overall,” which in general indicates how much care the developer took with making the code 
easy to use. 

There was very little overlap from code to code of features that the users appreciated. 
Some users liked the user interface of some codes, while one code was notable for animation, 
and another for being database driven. Two features that did show up more than twice were 
the ability to add user-defined subroutines, and a library of application modules. 

There was also a wide variety of additional features that the users desired, again with 
surprisingly few common needs from code to code. Significant needs included better documen- 
tation (mentioned for four codes), and a better data interface capability (also mentioned for 
four different codes). 

The acceptability of the simulation run times was much more uniform. The rigid body 
codes were all considered acceptably fast, while the flexible body codes were, in general, ac- 
ceptable only for small problems. 

There was considerable variety in the user comments on the type of data needed for the 
representation of flexible components, indicating the lack of maturity of this area. 

It is important to keep in mind that some multibody codes have been around for a long 
time and so have an extensive user and applications base, while other codes are new and have 
only been used by relatively few people on a small number of examples. 

Conclusions 

Examining the reports of existing multibody code users, three facts become evident: 
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1. It is difficult to use flexible multibody codes in design and analysis due to the long execution 
times for realistic problems, 

2. Representation of component flexible bodies is poorly understood, and 

3. Not enough thought has been given to the user interface by the code developers. 

At the Workshop on Multibody Simulation in 1988, JPL made a commitment to coordinate 
a multibody code verification, as part of an ongoing effort of the whole multibody simulation 
community. This survey report, as well as the verification library begun jointly with the 
University of Iowa, represent the first steps toward meeting that commitment. 
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TABLE 1 MULTIBODY SIMULATION CODES INCLUDED IN THE SURVEY 


Code 

Contact 

* of 
Reply* 

Information 

Received 

Availability 

Host 

Computer 

ADAMS 

Rajiv Rampali 
MDI Inc. 

3055 Plymouth Rd. 
Ann Arbor, Ml 48105 

3 

extensive 

yes, commercial 

• Cray * Prime 

• NEC • Cyber 

• IBM • SGI 

• Alliant • HP 

• Convex • Sun 

• DEC • Apollo 

* Intergraph 

ALLFLEX 

Jimmy Ho 

Lockheed Missiles & Space Co. 
Space System Division 
1111 Lockheed Way 
Sunnyvale, CA 

1 

moderate 

yes 

• CRAY 

• VAX 

AUTOLEV 

David Levinson 
93-30/250 

Lockheed Palo Alto Research Lab 
3251 Hanover Street 
Palo Alto, CA 94304 

1 

moderate 

yes, commercial 

IBM PC/CLONE 

CONTOPS 

John Sharkey 
NASA MSFC 
Code ED 12 

Marshall Space Flight Center 
Alabama 35812 

4 

extensive 

yes, COSMIC 

• VAX 11/750 

• MICRO VAX 

• HP 9000 

DADS 

CADSI, Inc. 

P.O. Box 203 
Oakdale, IA 52319 

5 

extensive 

yes, commercial 

• VAX • Multiflow 

• Alliant • Prime 

• Apollo • Silicon Graphics 

• Cray • IBM PC 

• DEC • Computervision 

• IBM ♦ Cyber 

• Sun • HP 


































Code 


Contact 


* of 
Replys 


DAMS 

A.A. Shabana 
P.O. Box 4348 
Dept of ME 
U of Illinois @ Chicago 
Chicago, IL 60680 

DISCOS 

COSMC 
11. of Georgia 
Computer Service Annex 
382 E. Broad Street 
Athens, GA 30602 

DYNOCOMBS 

Ron Huston 
U of Cincinatti 
Dept, of ME & IE 
Cincinatti, OH 45221-0072 

DYNA/EF3 

Jim Lawrence 
NASA JSC 
Mail Code EF3 
Houston, TX 77058 

DYNAMUS 

Farid Amirouche 
Dept of ME 
P.O. Box 4348 
U of Illinois @ Chicago 
Chicago, IL 60680 


Information 

Rocoivod 


Availability 


Host 

Computer 


limited 

yes 

IBM 

extensive 

yes, COSMIC 

• IBM 3090, 770 

• VAX 11/780, 750 

• MICRO VAX 

moderate 

yes 

• IBM 4043 

• VAX 1170 

moderate 

yes 

Gould Concept 32 

extensive 

yes 

♦ IBM 370 

• VAX 11/780 












































































Code 


MULTFLEX 



SD/EXACT 


SPASIS 


Contact 


Martin Tong 

The Aerospace Corp. 

MS (M4/972) 

P.O. Box 92957 

Los Angeles, CA 90009-2957 


COSMIC 

University of Georgia 
Computer Services Annex 
382 E. Broad Street 
Athens, GA 30602 


Dan Rosenthal 
11585 Silvergate Drive 
Dublin, CA 94568 


COSMIC 

University of Georgia 
Computer Service Annex 
382 E. Broad Street 
Athens, GA 30602 


# of 
Replys 


Information 

Received 


Availability 


Host 

Computer 


proprietary 


CDC Cyber 175 
Cray X-MP 


yes, COSMIC 



Station Control 

Simulation MailCodeEH2 

Houston, TX 77058 


John Sharkey 
NASA MSFC 
Code ED 12 

Marshall Space Flight Center 
Alabama 35812 



• VAX 11/785 

• IBM 3090 

• VAX Station II 

• Sun Workstation 

• HP-9000 

. MICRO VAX 


TREETOPS 


3 


moderate 


yes, COSMIC 







































TABLE 2 USER EXPERIENCE 


CODE 

DOCUMENTATION 

WELCOME 

FEATURES 

ADDITIONAL 
FEATURES DESIRED 

USER 

EXPERIENCE 

TYPICAL 

APPLICATIONS 

EASY TO USE? 

RUN TIME 
ACCEPTABILITY 

FLEXIBLE 
COMPONENT DATA 

ADAMS 

•VlMdoounwM 

• User's A 
Application Manual 

• Thaary Notes 

• Source not svalttbto 

• Q CAOrCAE Interface 

• Uneertution. Modal 
Analysis 

• Largs Joint Library 

• Wide Variety ot Forces 

• Accurate FlexiMIty 

• Ful Fetture R ratal 
- hnertaoe to MotrtxX 

• Poogrrooeseor wth 

• User Subroutlnee 

• Butt in control elements 

• Theory manual 

•Extensive 
{>100 Engineers 
trained at MDf) 

•Fully staffed 
hotline support 

siimurKiaa: 

• Subsystem * system 
modtotog 

• Su^aneion Dsslgn 

• RUe Analysis 

» Robotics 

• Spececrttt Simulation 

• Mechanism analysis 
anddsekm 

•Yes 

• 0 commercial 
preprocessors 
available 

• 3 high-end 
commercial graphics 
animation post- 
procosaora 

Problem 

Dependant 

•ANSYSNASTRAN 
reduced stiffness 
Matrix 

• Direct physical 
properties 

• Handies al geometric 

nonlinearities 

• Standard beam and 

field eiemanti 

ALLFLEX 

* Documentation 

• Some* not svteltela 

• Easy and test to 
aseembte simulation 

• Menu driven input 

extensive 

Spacecraft 

yes 

Problem dependant 

Format is wall 
defined 

AUTOLEV 

* Documented 

* On-line help 
convnand 

* Sourea not available 

• Vary vsraatita 
♦ PC based 

• Explicit symbolic 
aquations 

None thus far 

moderate 

• Robotic dance 
■ Spacecraft 

yes 

yes 

Handles systems of 
rigid bodies only 

CONTOPS 

• VO wall documantoc 

• Innar workings not 
woi documented 

• CodacofTwnants 
fkatehy 

• Interactive 
prsprocsssor 

• Plotting nicely dons 

• Mors help In error 
tracing 

• Modal dtta prep 
•samples 

• Mors application 

• Sensor model 

• More boundary con- 
ditiona tor flat data 

• Frequency response data 

• Expand output Mat 

extensive 

• Robots 
•RMS 

• Spacecraft 

• Robot experiment 

• Smart Structures 

yss 

Fair, not acceptable for 
flexible problems 

• More flexibility in 
flex -body 
representation 

• Flexible augmentation 
is obscure 

DADS 

• Ws« documsmod 

• Sourea not avoflafala 

• User's manual 

• Examples manual 

• CAD/CAE irttorfscec 

• Prsprocsssor 

• Control syatsm 
modeling tiarsrti 

• Pottprocossor 

• Extramaly stable 
integration 

• Mors sophisticated 
postprocessor 

• Intertecs to control 
analysis parksgs 

• Stop/restart 

• Error maaaagas 

• Control Analysis 

capability 

• Extensive 

• Fully staffed 
hotline support 

• Servo control of 
robot arms 

• Vahtde simulation 

• Mechanisms 

• Spacecraft 

i 

yes 

Problem dependant 

• Component modal 
synthesis 

• interfaces to NASTRAN. 
ANSYS, ABACUS 

• Handles ail geometric 

nonlinearitiee 

DAMS 

Sourea is avertable 
but not well 
commented 

User subroutine 
facility 

not avaiabi# 

moderate 

• Machine design 

• Robot 

• Impact 

Mixed 

Problem Dependent 

Normal modes 


ORIGINAL PAGE IS 
Of. POOR QUALITY 



















EXPERIENCE 


TYPICAL 

APPLICATIONS 


EASY 
TO USE 


RUN TNIE 
ACCEPTABILITY 


FLEXIBLE 
COMPONENT DATA 







extensive 

• Spacecraft 

• Rapid retargeting 
managers 

• Deployment 

• Propellent Noth 

• Check other oods 

• No 

• Need thorough 
understanding 
of coda 

• Problem dependent 

• Too slow tor high 
order systems 

• Normal modes 

• Any mode shape 

rnobmit 

• Towing of submerged 
cables 

• Robotic 

• Homan body modeftng 

yea 

yes 

Ftexibflity is modeled 
at jotnu 

moderate 

• Human-in-the-loop 
reel time emulator 
tor shuttie and 
space station 

• Shuttle RMS 

fthuttta/atation 

docking 

lair 

yes (reel time) 

Does not modal 
tlexibitity 

moderate 

Constraint Tree, 
Ftextblle Spacecraft 
deployment. Manipulator 

yea 

gpori 

Normal modes 
Componem moos 

analysis 

limited 

■ Spacecraft 
• Robot 

yea 

yes 

Craig* Bempton 
modes 


ORIGINAL PAGE IS 
OE POOR QUALITY 
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CODE 


DOCUMENTATION 


WELCOME 

FEATURES 


ADDITIONAL 

FEATURES 

DESIRED 







NBOD2 

• Fairty wall 
documariad 

• Source available 
and commented 

• Praprocataor coda 
lahat^M 

• Indaaing achama on baa 
and conalrainad joints 
can ba improvad 

• Simpia application 
examples daairad 

limitad 

SD/EXACT 

• Good manual 

• Sourca coda not 

availabla 

m 

m 

• Mora efficient 
mamory uaa 

• Handla flexibility 

• Graphics 

• Variabia dag raa of 
fraadom 

extensive 

SPASIS 

• WaU Documented 

• Sourca availabla 
and commaniad 

• Good documentation 

• Usar friandly 
inputs 

• Batch processing 

uelitfee 

• Extensive aarth 
orbital dynamics 

• Mora affidant Baro- 
dynamics and propaiiani 
dynamics option* 

• Animation 

• Raboost ability 

modarmta 

Station 

Control 

Simulation 

• Code documented 

• Sourca avaiabla 
and commaniad 

• Mmtiiar design 

• Sail contamad 

• Modal reduction coda 

• Faster coda 

limitad 

TREETOPS 

• Reasonably 
documantad 

* Sourca ia avaiabla 
but not Ihoroughty 

• Interactive 
preprocessor 

• Spaadup options 

• Library of 

aanaora 

• Certain Han inputs not 
fuly dafinad 

• Graphics praprocaaaor 

• Cumbarsoma output 

• Battar intarfaca with 
usar subroutine# 

modarata 


EASY 
TO USE 

RUN TIME 
ACCEPTABILITY 

FLEXIBLE 
COMPONENT DATA 

mixed 

yaa, but should ba 
improvad 

Does not modal 
flaxibility 

ym 

yaa 

Doa* not model 
flexibility 

yaa 

yea 

Doas not modal 
flexibility 

yaa 

Excasaiva whan flaxtbia 
modal used 

Outputs 2 and 4 
from NASTRAN 

usually 

Usually, high tor 
flexbody payloads 
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Abstract: This paper describes the Spatial Operator Algebra framework for the dynamics of general 
multibody systems. The use of a spatial operator-based methodology permits the formulation of the dy- 
namical equations of motion of multibody systems in a concise and systematic way. The dynamical equations 
of progressively more complex rigid multibody systems are developed in an evolutionary manner beginning 
with a serial chain system, followed by a tree topology system and finally, systems with arbitrary closed 
loops. Operator factorizations and identities are used to develop novel recursive algorithms for the forward 
dynamics of systems with closed loops. Extensions required to deal with flexible elements are also discussed. 


1 Introduction 


The field of multibody dynamics is currently being challenged in two major ways. The increase in the size and 
complexity of spacecraft systems requires the development of tools that not only help manage the complexity 
of such systems, but also facilitate the development of novel dynamics formulation techniques and solution 
algorithms. Areas such as robotics involve multibody systems consisting of multiple robot manipulators 
interacting with each other and with complex environments. These are multibody systems with not only 
constantly time-varying topological structure, but also ones in which the constituent bodies change with 
time. Coping with this aspect requires versatile and flexible dynamics simulation tools. 

In this paper, the Spatial Operator Algebra Framework [1] is used to develop a systematic procedure 
for concisely formulating the equations of motion and derive spatially recursive forward dynamics algorithms 
for multibody systems. The equations of motion of progressively more complex rigid multibody systems such 
as serial chains, tree topology systems and finally closed chain systems are developed. Operator factorizations 
and identities are then used to obtain efficient spatially recursive algorithms for the forward dynamics of 
such systems. Extensions to handle flexible link elements are also discussed. 


2 Equations of Motion 


We begin by briefly describing the coordinate-free spatial notation used throughout this paper. Given the 
linear and angular velocities v and u> t the linear force F, and moment A at a point on a body, the spatial 
velocity V , spatial acceleration a and the spatial force f in 7Z e are defined as follows: 

v±(:), .iv, /s(») 

The rigid body transformation operator € 7£ 6x6 is defined as: 


H 0 = 
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where / is a vector joining two points, and / is the cross-product matrix associated with / which acts on a 
vector to produce the cross-product of l with the vector. 4>(l) and transform spatial forces and spatial 
velocities respectively between two points on a rigid body seperated by the vector l. For a rigid body, its 
spatial inertias Me and Mo at its center of mass C and at another point O respectively, are defined as 

"< c > = ( T> 1 "<°> = mmc)*- w = ( Z ) 

where p is the vector from O to C, m is the mass of the body, and J(C) and J(O) are the inertia tensors 
for the body about C and O respectively. The reader is referred to [2] for additional discussion on the use 
of spatial notation. 


2.1 Dynamics of a Serial Rigid Multibody System 

Serial rigid multibody systems form basic subsystems from which the dynamics of more general rigid multi- 
body systems can be generated. In this section we derive the equations of motion of a serial multibody system 
consisting of n rigid links connected together by multiple dof joints. The links are numbered 1 through n 
from tip to base. We use the terms outboard ( inboard) link to refer to a link on the path towards the tip 
(base). 


The set of configuration variables for the serial chain are the collection of the joint configuration 
parameters. It is assumed that the k th joint possesses r p (k) positional dofs parameterized by the vector of 
configuration variables 6(k) (of dimension at least r p (fc)), and that its r v (k) motion dofs are parameterized 
by the r v (k) dimensional joint velocity vector /3(k). The kinematical equations which relate 9(k) to /3(k) 
depend on the specific nature of the k th joint. It is assumed for notational convenience that all the joint 
constraints are homogeneous (i.e., catastatic). H(k) is defined such that H*(k) is the 6 x r„(it) joint map 
matrix for the k th joint whose columns span the space of permissible relative spatial velocities Ay(k) across 
the joint. The complexity of t he dynamics algorithms for the serial chain is determined by the number of 

overall motion dofs M = r v (k) for the chain. The state of the multibody system is defined by the 

collection of [#(.),/?(.)] far all the joints, and is assumed known. 


Since each link is rigid, it suffices to develop the equations of motion at a single reference point 
on each link, which is taken to be the inboard joint location O k for the k th link. With V(k) denoting the 
spatial velocity, a(k) the spatial acceleration, f(k) the spatial force and T(k) the joint force at O k for the 
k ih link, the following Newton -Euler recursive equations describe the equations of motion for the serial rigid 
multibody chain: 


1 V(n 4 * 1 ) = 0, a(n + 1 ) = 0 
for k = n • * 1 

V(k) = P(k+l,k)V(k + l) + H'(k)p(k) 
a(k) = P(k + l 9 k)a(k+l) + H'(k)j3(k) + a(k) 
end loop 

( 2 . 1 ) 

m = 0 

for k = 1 • * • n 

< f(k) = <f>(k + 1 , *)/(* - 1 ) + M(k)a(k) + b(k) 

T(k) = H(k)f(k) 

end loop 


a(&) and b(k) are the velocity dependent Coriolis acceleration and gyroscopic forces repectively for the 
k ih link at Q k . <t>{k,k — 1) denotes the transformation operator from O k - 1 to O k ■ For additional details 
regarding the derivation of these equations of motion, see [1]. We have made the simplifying assumption 
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that the tip force /( 0) is zero. Attaching a full 6 motion dof joint between the physical base and the inertial 
frame allows us to easily deal with the mobile base situation. For the inverse dynamics problem, the joint 
accelerations /? are known, and Eq. (2.1) represents an 0{Af) computational process involving a base-to-tip 
recursion to compute the velocities and accelerations, followed by a tip-to-base recursion to compute the 
joint forces. 


In order to express the equations of motion given by Eq. (2.1) in a more compact form, vvc define 
the stacked notation. In this notation, the K(fc)’s, cr(fc)’s etc are viewed as components of vectors V , a etc. 
Then Eq. (2.1) can be written in the following compact form: 


e;v + H*0 
e; a + H*(3 + a 
f = £+f + Ma + b 


V = 

a = 


( 2 . 2 ) 


where, 


e+ = 


H i 


/ 


0 

*( 2 , 1 ) 

0 


0 

/ H( 1 ) 
0 


V 0 


0 

0 

*(3,2) 

0 


0 

H(2) 


0 

0 

0 


\ 

AT(1) 

0 

0 \ 

A 

0 

M(2) . 

0 

, M — 





o 

0 

. . M(n) / 


0 \ 

0 


H(n) ) 


However, since £$ is nilpotent ( = 0), 


* = (/ — £+) 1 = / + £<p + £% + 


+f ; _1 = 


/ i 

*(2,1) 


\ *(n, 1) *(n,2) 


0 \ 

0 


/ ) 


(2.3) 


(2.4) 


where, 

1 - l) * + 1 > i) 

Thus Eq. (2.2) can be reexpressed in the form, 

V = <f>* H* j3 

a = + (2.5) 

T = Hf= H<t>M<f>* H*f3+ H<j>(M<j>*a + b) 

= MP + C , where M = and C = H<j>(M<t>*a + b) 

M G is the mass matrix for the serial chain and C G 11^ consists of the velocity dependent Coriolis, 

centrifugal and gyroscopic joint forces. In the terminology of Kane’s method [3], /? are the generalized speeds 
and the elements of are the partial (spatial) velocities. 


(f) y H , and M are the first of the spatial operators that will be encountered. Recursive dynamical 
algorithms can be derived naturally by exploiting the special state transition properties [1] of the elements of 
spatial operators such as £<f>, <j> etc.. For instance, given a vector y , the evaluation of the matrix-vector product 
<f>y does not require an 0(n 2 ) matrix-vector product computation, and not even the explicit computation 
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of the elements of <£, but rather, it can be evaluated using an 0(n) recursive algorithm involving only the 
elements of and y. This is precisely the correspondence between the concise operator based high-level 
description of the equations of motion in Eq. (2.5) and the recursive algorithmic description in Eq. (2.1). 

Spatially recursive 0{M) forward dynamics algorithms for serial chains have been developed in 
[4] based on the recognition of the isomorphism between the structure of the dynamics equations and the 
equations encountered in Kalman Filtering theory. These insights have formed the basis for the development 
of the Spatial Operator Algebra Framework for multibody dynamics. 


2.2 Tree Topology Systems 


In this section, the dynamics of rigid multibody systems with tree topological structure are discussed. A 
tree topology system may be viewed as a set of component serial chains (referred to as branches) coupled 
together via joints at their terminal links. The total number of branches is denoted l. The index for the 
branches thus ranges from 1 • t, and consistent with the link numbering scheme in the previous section, 

the inboard branches are assigned indices larger than those for the outboard ones. The connectivity function 
i(k) is defined as the index of the direct predecessor branch, i.e., the inboard branch to which the k tfi branch 
is connected. The j th branch is simply denoted a predecessor branch for the k th branch if it belongs on the 
unique path from the k th branch to the base, i.e., if i p (k) = j for some integer p > 0. The joint coupling two 
branches is assigned to the outboard branch. Figure 1 illustrates the link/branch numbering convention for 
tree topology systems. 

The notation for serial chains from Section 2.1 is carried over to describe the branches in the tree 
structure, and an additional subscript is used to identify the specific branch in the system. Thus rij and A fj 
denote the number of links and the number of motion dofs respectively, while Vj , M ; , <f>j etc. denote 
the appropriate spatial velocity etc. quantities for the j ih branch. A link/joint is identified by the index 
of the branch it is on, plus its location within the branch. For instance, V{kj) (or more accurately Vj(k)) 
denotes the spatial velocity of the the k** 1 link of the j** 1 branch at its inboard joint location Q(kj). The 
overall stacked spatial velocity, acceleration etc. vectors for the tree are now denoted V, a, / etc. with 
V = [V* • ■ * V t *]* etc.. The total number of links n, and the total number of motion dofs M for the system 
are given by 

n= and Af == (2«6) 

j = 1 

Note that when the j th branch is the direct predecessor of the k th branch, i.e., j = i(k), the joint connecting 
them is the njj/* joint on the k th branch and is located on link lj on the j ih branch. The transformation 
operator from the nj* joint to the l) h joint is denoted n k ). The spatial operator €+ is now defined in 
terms of its block matrix elements below. For j,k E 


£<t>j 


for j = k 




(i 


o \ 

o o 


for j = t(k), i.e. if j is the direct predecessor branch of k 


\ 0 0 0 / 


I 0 for j ^ i(k). i.e. if j is not the direct predecessor of k 

(2.7) 

0 denotes a zero matrix of dimension appropriate for the context. As a consequence of the numbering scheme 
used here, for j < k , the j th branch cannot be a predecessor to the k th branch and thus the (j, k) th block 
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element, £<t>(j, k) = 0. Thus £$ is a strictly lower triangular matrix. The analogs of Eq. (2.2) are as follows: 




v = e;v + H'ft 




a = £ja + H*ft + a 




f = £<t>f + M a + 6 

(2.8) 

Once again (analogous to Eq. (2.4)), £+ is nilpotent (££ = 0), and so 




4> = (i-£<,)-' = i+e++ei+ ••• 

(2.9) 

The block structure of <f> is described below: 



r * <h 

o’ 

II 


Hj, k) = < 


,l if 3 p > 0 : j = t p (k) } i.e., if j is a predecessor branch of k 

(2.10) 


0 

V. 

if j zfi i p (k) V p > 0, i.e., if j is not a predecessor branch of k 


Here , /jb)} m ,/ denotes 

a block matrix whose ( mj) th entry is given by ^(m^,/*) with m £ 1 • 

• • Uj and 


l € 1 <i>{rrij,l fc) is the transformation operator from joint /* (on the k th branch) to joint mj (on the 

j th branch) and is a generalization of the transformation operator in Eq. (2.4) for serial chains. It 

is formed by sequentially composing all the individual transformation operators that lie on the unique path 
joining the two joints. The numbering scheme used here ensures that <f> will be a lower triangular matrix. 
The operator <j> has state transition properties analogous to the <f> for serial chains, and as a consequence, it 
can be used for high-level and concise description of the dynamics of tree topology systems (as in Eq. (2.11) 
below), but with the full understanding that from the computational perspective, these equations directly 
map into recursive implementation procedures. From Eq. (2.8) and Eq. (2.4) it follows that, 

V = 4>'H'P 

a = f{H*0 + a) (2.11) 

/ = + b) = <pM<p* H* ft + + 6) 

T = fff = H4>M<f>* H* ft + H<j>(M<t>*a + b) 

= Mft + C , where M = H<j>M<j>* H* , and C = + b) 

M. £ denotes the mass matrix for the tree system. T ^ T - C can be easily computed from the 

knowledge of the system state, and so the equations of motion for the system can be rewritten in the form 

M0 — T (2.12) 

The forward dynamics problem requires then the solution of the joint accelerations /? for a given set of joint 
forces T. The mass matrix for the system is typically not available and potentially needs to be computed to 
solve the forward dynamics problem. However, in Section 3, a recursive 0(N) forward dynamics algorithm 
for tree topology systems, which does not require the explicit computation of the mass matrix M, is derived. 

Before proceeding on to closed topology systems, we first derive the structure of the Jacobian 
operator. Given n c points, denoted l C* ’s, on the tree (see Figure 1), the Jacobian operator J e Jl 6ncXj ^ 
defines the mapping between /? and V , i.e., V = J/?, where V £ 1l 6nc denotes the vector of spatial velocities 
at these points. If C* is on link mj , then the spatial velocity at C* is given by 

V(k) = p(0(m j ) t C k )V(m j ) 

with <f>{0(mj),Ck) denoting the rigid body transformation operator from C* to the point 0{rrij). With the 
block elements of B £ 7 Z 6nx6rlc defined as 

( </>(0(rrij),Ck) if C k € mj* link 

5(m J »=< for k = 1 ■ ■ • nc (2.13) 

S 0 otherwise 
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it follows that 


V = B*V = 5* </»*//*/?, i.e., J = H* (2.14) 

This gives us an expression for the desired Jacobian operator which will be used below when dealing with 
loop closure constraints for closed topology systems. 


2.3 Closed Topology Systems 


This paper develops a systematic procedure for the formulation of the equations of motion and derivation of 
forward dynamics solution algorithms for general topology multibody systems with time-varying topologies 
as well as changing constituent bodies. Based on the specific application, such systems may be conceptually 
partitioned as follows: 


(a) The primary system consisting of the least time-variant part, i.e., the multibody subsystem with fixed 
topology and constituent bodies. 

(b) The secondary system consisting of the multibody subsystem which may change from time to time. 

(c) The set of closure constraints and/or boundary conditions between/within the primary and secondary 
systems which change with changes in the system topology. 

Note the the subsystems described above are in the order of increasing time-variation. As an example, let 
us examine the robotics scenario of multiple manipulators interacting with each other and the environment 
to perform complex tasks. In this context, the manipulators belong to the primary system since their innate 
structure varies very little with time. The task objects vary from task to task and form the secondary 
system. The constraints between these two subsystems change during the execution of a task, such as 
grasping, mating, tool operation etc., and belong to the last category. 

This partitioning allows us to derive a very general and yet systematic procedure for the development 
of dynamics algorithms which are responsive and adaptable to time-varying systems. The procedure involves 
a sequence of decoupled steps for each of the primary and secondary system dynamics, and one step in which 
they come together when the constraint forces are computed. Being structurally time-invariant, it is possible 
to put in place optimized algorithms for the dynamics of the primary system. The time-variant secondary 
system is typically of small complexity and thus the use of standard, though suboptimal, algorithms does 
not substantively degrade performance. 

This decomposition of the closed topology system is a departure from the more traditional approach 
(see [5, 6]) of forming a spanning tree for the full system and computing the constraint forces at the points 
of closure. In these latter approaches, even small changes in the original system typically lead to whole new 
spanning trees for the system. This disallows any algorithmic optimization, and the algorithms are also not 
very amenable to coping with time- varying systems. 

The primary and secondary systems in most applications have tree topological structure. However 
in general there may be internal closed loops within either system. In any case, by cutting an appropriate 
number of joints, each subsystem may be regarded as a tree topology system with additional kinematical 
constraints at the internal loop closure points. The equations of motion for tree topology systems derived in 
Eq. (2.12) will be used to describe the dynamics of the tree components of both the primary and secondary 
systems, with the subscripts “P” and “S” differentiating the two subsystems. Thus the dynamics of the tree 
part of the two systems are described by 

Tp = Hp<j>pMp<f)* P Hpf3p — Mpftp, and is “ — Msfts (2.15) 

Mp and Ms denote the mass matrices, 0 P and 0s the motion dof parameter vectors, 7> and T s the 
bias free internal joint forces for the primary and secondary subsystems respectively. 
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Combining the internal loop points of closure with the points of closure coupling the two systems, we 
obtain the overall points of closure for each of the subsystems. Let Vp and Vs denote the spatial velocities 
at these overall points of closure for the two systems, and following the discussion leading to Eq. (2.14), let 
J p = Bp<j>pHp and J s = B* s <f>* s Hg denote the Jacobian operators for the two systems corresponding to 
these points. Thus Vp = Jp/3p and V$ = Jsfis- The kinematical constraints due to the existence of internal 
closed loops within the primary and secondary systems leads to constraint equations of the form: 

QpVp — Up and Qs V$ — Us 


The coupling together of the primary and secondary systems via joints leeds to constraint equations of the 
form: 

QpVp + Qs Vs = Uc 

Defining 


yip = 


Qp 

Qp 

o 


As = 


Qs 

0 

Qs 


, and A == [Ap As] 


the closure constraints can be collectively expressed in the form: 


A ( it ) = [Ap - 451 ( 'o j s ) ( h ) = ,ApJp ' 4sJsl ( & ) “ ( $' ) ~ u <2 ' 16> 


It is assumed that [A P J P A s Js] is of full row rank Me- The overall number of motion dofs of the closed 

chain system is given by Me = M + Ms - Me, and if necessary, Eq. (2.16) can be used to find the Me 
dimensional minimal set of motion generalized coordinates for system. Based on the principle of virtual 
work, Eq. (2.16) implies that the closure constraint joint forces are of the form 


( J P A P 
\ Js-^s 


) 


f 


for some / € . This leads to the following overall equations of motion: 

( Mj> 0 J P A P \ / (3 P \ / T P \ . . f V P 

0 Ms I 05 ) » f* , where U = U- [( A P J P ) (. A S J S )] ( y 

\ A P J P AsJs 0 ) \ S ) \ V ) 



/ Mv 

0 

JpAp \ 

? p \ 

( fp \ 

=> 

0 

Ms 

J'sA's 


, Ts ) 


\ 0 

0 

— [ApApAp + A 5 A 5 A 5 ] J 

V / ) 

\ U — [Ap JpMp l Tp 4- AsJs T 5 ] / 


( 2 . 17 ) 


where 

A P = J P M P l J P , and A s - JsM^Js 

Note that Ap and A 5 are the effective “admittances” of the primary and secondary systems reflected to the 
points of closure. We now describe some special cases of the above setup: 


• The joint constraints coupling the primary and secondary systems are typically on the relative spatial 
velocity across the joints at the points of closure. When this is true for all joints, an appropriate 
reordering of the elements of V will result in Q P = -Qs- Furthermore, if no relative motion is 
permitted across the joint, i.e., there is rigid rather than loose coupling, then in fact Q P - I and 
Q s = 1. When this is the case for only some of the joints, only the corresponding rows have these 
special features. 

• If the secondary system has no internal actuators or source of generalized forces, then Ts = 0. 

• If the secondary system is a free rigid body with no internal degrees of freedom, then the motion 
generalized coordinates vector /3s is of dimension 6 and consists of the 3 translational and 3 rotational 
dof parameters. 
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3 Forward Dynamics of Closed Chain Systems 


In this section we discuss a recursive method for solving the forward dynamics of closed chain rigid multibody 
systems. This method does not require the explicit computation of the primary and secondary tree system 
mass matrices Mp or Ms, hut does require the computation of the constraint force parameters. 

From the equations of motion of the closed chain system with dynamical closure constraints given 
by Eq. (2.17), the solution of the forward dynamics problem can be solved by the following sequence of steps: 

(A) Solve M P p s - T P for 0 f p Solve M s p } s = T s for /?£ 

(B) Compute dtp = Jpfcp Compute d£ = JsP$ 

(C) Compute A p - JpMp l Jp Compute As = Js-Mg l Jg 

(D) Solve [ApApA'p + A s AsA* s }f = {A P a J p + A s a I s )-U for f 

(E) Solve MpP 6 p = —JpApf for (3 S P Solve MsP S s = for 4s 

(F) Pp = P } p + P S P Ps=% + P 6 S 

As a result of the partitioning, a changes in the closure constraints only effect A and thus only STEP 
D, while changes in the secondary system effect only the steps in the right half column. Recursive algorithms 
for carrying out each of these steps are derived below. The proofs of the various lemmas are omitted due 
to space limitations. However they follow precisely along the lines of the proofs for serial chains discussed 
in [7]. The explicit use of the subscripts indentifying the primary /secondary system is dropped (except for 
STEP (D)) since the discussion is equally applicable to either subsystem. 

STEP (A) Solve MP* = T. (Forward Dynamics of a Tree Topology System ) 

Note that Step (A) is equivalent to solving the forward dynamics of a tree topology system, and we 
develop an 0(N) recursive algorithm for this solution. This algorithm is based on a new factorization 
of the mass matrix M in terms of square factors, which may be contrasted with the earlier non-square 
factorization in Eq. (2.11). This square factorization is then used to obtain an explicit expression for 

M~K 

The articulated body inertia matrix P is defined as the solution to the following equation: 

M~P- £+[P - PH*{HPH*)- l HP]£; (3.1) 

P is block diagonal and the elements on the diagonal (denoted P(kj)) can be obtained using a recursive 

algorithm described in Eq. (A.l) in Appendix A. Physically, P(kj) is the articulated body inertia as 

seen at the kj h joint, i.e., it is the effective inertia of all the links outboard from the k* h joint assuming 
that the joint forces at all the outboard joints are zero. 

For the subsequent development, it is convenient to define 

D = HPH* , G = PH*D~\ K = E^G 

T = GH, T= I-T, £+= £jf (3.2) 

Note that D , G , r and r are all block diagonal. The structure of £$ is identical to that of £$ with its 
elements being given by 

— 1 ) = <j>(kj,kj — l)r(kj — 1 ) 
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(3.3) 


Sxp is also nilpotent (££ =0), and analogous to <f>, \j> is defined as 

t/> = (I - Si,)- 1 = I + £* + £$ + +£;-' 

The structure of rp is very similar to that of <j> and it also possesses the state transition properties which 
are used to develop recursive algorithms. <f> may be viewed as the transformation operator for composite 
bodies (i.e., as if all the joints are locked), while ^ is the transformation operator for articulated bodies 
(i.e., as if all the joint forces were zero). The following lemma yields a square factorization of M . 

Lemma 1: The mass matrix M has the following factorization: 

M = [I + H<pK]D[I + H<f>K}*, (3.4)1 

The following lemma gives the explicit form for the inverse of [/ + HtpK]. 

Lemma 2: 

[7 + H4>K}- 1 = [7 - HxpK] (3.5) I 

Combining Lemma 1 and Lemma 2 leads to the following form for the inverse of the mass matrix. 


Lemma 3: 


AT 1 = [7 - fht>K]*D- l [I - Hj>K] 


(3.6)1 


Thus, 


ft = M~ l t = [7 - 77V’7t]*D -1 [7 - Hi>K]T (3.7) 

The 0(N) recursive computation of the expression on the right is given in Eq. (A. 2) in Appendix A. 


STEP (B) Compute 6,1 - jft 

From Eq. (2.14), 6,1 = B*o,l , where 

ft = ft H* ft (3.8) 

However we have that, 

Lemma 4: 

(7 - HipK)H<l> = Hip (3.9) I 

Thus using Eq. (3.6) and the above lemma in Eq. (3.8), 

6,1 = ftH'[I - HipK\* D~ l [I - HipK]T = ftH'D~ l [I - HiPI<)f 

Comparing this with Eq. (3.7) we see that 6,1 can be evaluated as an intermediate quantity in the 
0(A7) recursive algorithm for computing ft described in STEP (A). 


STEP (C) Compute A = 

Using Eq. (2.14) and Eq. (3.6), 

A = {[7- HipK]H<pBYD~ l {[l - HipK]H4>B} 

= B* ft H‘ D~ l HipB — B*QB, where n = ftH'D~'HxP (3.10) 
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where Eq. (3.9) has been used to simplify the above expression. A recursive 0(M) procedure for the 
computation of Q is given in Eq. (A. 5) in Appendix A. Note that without the simplification resulting 
from the use of Eq. ( 3 . 9 ), the computation of A would be an 0(M 3 ) process. 


STEP (D) Solve [ApAA'p + A s A s A‘ s ]f = ( A P a f p + .4s«£) - U for / 

Now, 

f = [ApAA‘ P + A s AsA t s }- l [(Apa i P + A s a f s )-U] (3.11) 

In this form this step is of 0(M P ) complexity. However, when (ApApA* P ) is invertible, we can obtain 
an alternative expression for / by reexpressing Eq. (2.17) as follows: 

( Mp 0 r P A' p \ ( b \ ( Tp \ 

0 Ms J'sA's ]\ 0S = fs 

\ 0 AsJs ~A P ApA P / \ f / \ U — Apa p / 

and consequently, 

( M P 0 J' P A* P W b \ 

0 Ms+JsA's(A P AA* p )~ l AsJs 0 /?s = 

\ 0 AsJs —A P ApA P / \ f / 

( f s + J’sA'siApApA},)- 1 ^ - A P a f P ] | 

\ U - Apa p ) 

From the above equation it follows that 

h = [MsAJsA m s(ApApA- p y l AsJs]- 1 [Ts-JsA*s(ApAA},)- l (Apa f P -U)] 

f = {ApAA' P )- l [{A P a 5 P + A s Js0s)-U] 

= {A P ApA' , p )~ l [(A P a s p A s a s ) -U], where a s = J S /3 S 

Note the similarity between the forms of Eq. (3.11) and the above equation for /. The computational 
cost of the above operation is a combination of the cost of inverting ApApA* P) and the 0(Af$) step of 
solving a square linear system of equations of size Ms • The cost of inverting ApApA p depends on its 
structure: its sparsity reflects the degree of coupling between the closed loops in the system. The cost 
is typically much less than the worst case of 0(M P ). In many application domains such as robotics, 
ApApAp is in fact block diagonal and is thus invertible in 0(Me ) steps [1]. In addition, for most 
applications Ms Mp, and this new formulation can lead to considerable computational savings. 

The inverse of [ApApAp + A 5 A 5 A 5 ] will not exist if [ApJp AsJs] is not of full rank, i.e., the 
configuration is such that the number of motion dofs for the system have changed. It is therefore 
necessary to reformulate the constraint equation Eq. (2.16) so as to preserve the full rank property. 
Such changes of rank can occur at kinematically singular configurations. 


STEP (E) Compute (3 6 = -M~ l J*A*f 

We have from Eq. (3.6) and Eq. (2.14) that 

p 6 = -[I - Hxl>K]*D- x [I - HipK]H<f>BA* f 


Using Lemma 4 this simplifies to 

0 s = -[/ - HipKYD" 1 H ipBA* f 

The recursive O(M) implementation of the above step is given in Eq. (A. 6 ) in Appendix A. 


( 3 . 12 ) 
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The overall complexity of this spatially recursive forward dynamics algorithm ranges between 0(Af+ 
Ms) 4- 0{M%) for the worst case and 0{M + Ms) + 0 (Me) + 0(M§) in the best case. 

By treating the primary and secondary system as one system, which amounts to defining the quan- 
tities ip = diag(ipp,tps) , H = diag(Hp f Hs) etc., and using the above results, the overall closed topology 
forward dynamics algorithm can be restated in the following form: 

/?= [I - HrpK) m D-t -b(A\A*)- l b'^D-*[l - Hi!>K]T, where b=HxpBA * (3.13) 

Note that when there are no closed loops in the overall system, >1 = 0, and the middle term reduces to /, 
and we recover the form for for the forward dynamics of tree topology systems in Eq. (3.7). 


4 Flexible Multibody Dynamics 


In this section we briefly describe the extensions to handle the case of flexible links. We use the serial chain 
discussed in Section 2.1 as an illustrative example, but now assume that the links in the chain are flexible. 
It is assumed (without losing any generality) that finite element models are available for all the links, and 
in particular, the k th link is characterized by: n* node points with the location of of the j th node denoted 
Q*(j)’s, the vector of displacement variables E 72 6nfc , a free-free mass-matrix E 'Jl 6nkX6nk 1 a stiffness 
matrix K k E 72 , 6n * x6n *. The ordering of the nodes is such that Qjb(l) is on the same element as O k ~i and 
Qk( n k) is on the same element as Ok ■ Treating the k th link as being pinned at Ok , this implies that 
u£(n*) = 0, and thus the true flexible dofs are given by the vector u k = where h^[I,0]. Note that 

u k E Te 6 ^*- 1 ) and h k E 72 6 "* x6 ( n *- 1 ). 


With U(ib) denoting the spatial velocity of the k th link at O k , 

V(k) = </>*(k + 1, k)V(k + 1) + H*(k)fi(k) + <t>* (Qi+^l), Ofc)wfc+i(l) 

= 4>'{k + 1, k)V(k + 1) + H'{k)0(k) + C' k+1 h* k u k+1 (4.1) 

where CJ +1 = [f (q* + 1 (1),0*), 0, • • • 0] G ft 6 * 6 "* (4.2) 


Thus, 


V = <p*[H*0 + C m h*u] 


with C defined as the block matrix with C 2 to C n along its first block subdiagonal, and h is the block 
diagonal matrix with j th block diagonal element being hj . Let V[ E 72 6n * denote the vector of spatial 
velocities on the k ih link at the n* node points. Then 


Vi 

V* 


BlV(k) + hlu k , where B k = [<j>{O k , - • ■ A{O k , Qk(n k ))] G ft 6x6 "‘ 

B'V + h’u = B'<j>'[H'p + C'h'u] + /i*u = B't'H'p + [/ + B>*C*]h*u 

+ B'H')X, where X = ( “ ) (4.3) 


B denotes the block diagonal matrix with the 5*’s along its diagonal. We have used the facts that, 


B k C k ~ <p(k t k-l) => BC^S* => BC<p = <p- I => B[I + C<pB] = <f>B 

Note that X is the vector of motion dofs for the serial links and includes both the rigid and flexible dof 
parameters. 


Using Eq. (4.3), the kinetic energy for the whole chain is given by 
T = |(V' / )*mC / = | X*M J X , where M 1 = f ) [/ + C<j>B)m[I + C<t>BX 


HB 

h 
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M s is the mass matrix for the flexible serial chain. Given this factored form for the mass matrix, similar 
techniques to those used for the rigid multibody case in the earlier sections result in alternate factorizations 
and inversion of the mass matrix, and recursive forward dynamics algorithms. The reader is referred to [8] 
for additional details. Just as for the rigid multibody case, the algorithms for flexible serial chains directly 
extend to the flexible general topology multibody systems. 


5 Conclusions 


This paper describes the Spatial Operator Algebra Framework for the dynamics of general multibody systems. 
Based on their rate of time-variation, the multibody system is partitioned into a primary subsystem, a 
secondary subsystem and the set of closure constraints. This allows the development of forward dynamics 
algorithms which are not only recursive and efficient, but also capable of easily coping with time varying 
multibody systems. The solution procedure consists of a sequence of steps on parallel paths involving the 
dynamics of the spanning trees for the primary and secondary systems. The two paths come together for 
one step in order to compute the constraint forces. Using the spatial algebra techniques to develop novel 
factorizations of the mass matrix and operator identities, efficient recursive algorithms for carrying out each 
of these steps is developed. The overall algorithm does not require the computation of the mass matrix, and 
its complexity is linear in the number of dofs for the tree systems. In addition, the impact on the complexity 
of the algorithm, of the degree of coupling among the closed loops in the system topology is made clear, 
and it is shown that the in the best circumstance, the algorithmic complexity is also linear in the number 
of closure constraint equations. During the development, an 0{Af) forward dynamics algorithm for tree 
topology systems is also developed. For the sake of clarity, the focus of much of the paper was on multibody 
systems with rigid links. However the extensions necessary to deal with flexible elements are discussed. 
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A Appendix 


Based on the special structure of <f> t tp etc., it is possible to evaluate many of the dynamical expressions 
in a recursive manner and we describe some recursive algorithms in this appendix. First we define some 
notational shorthand to simplify the description of the algorithms that follow: 


x(tlj + 1) 
y(nj + l.Bj) 

y(^i i Oj)x(Oj) 

y(l;-0 j )x(0 j )y*(lj,0j) 


y(l. (».";) 

Y2 y(l j,nm)x(n m ) 
m € * ” 1 0 ) 


Y2 y( 1 i» n mMn m )y*(li,Wm) 

*»€«“*(>) 


where y(.,.) and x(.) stand for some appropriate arrays. Thus wherever a term with indices as in the left 
column appears, its meaning is actually given by the corresponding term in the column on the right. !pr 


• A recursive method for the computation of the block diagonal elements of P as defined by Eq. (3.1) 
and the entries of D, G, K } £ $ and r defined in Eq. (3.2) are given by: 


for j = 1 • • t 

If r 

for k = 1 : ■ • 

pU 

m 

G(k) 

T(k) 

xp(k + 1, k) 
K(k + 1, k) 
end loop 
end loop 


l (j) = 0, then P(0,) = 0 

n 5 

rp(k,k — l)P(k — \)ij>*(k,k — 1) + M(k) 
H(k)P(k)H*(k) 

P(k)H*(k)D~ 1 (k) 

I - G(k)H(k) 

<t>(k + l,k)T(k) 

4>(k+l,k)G(k) 


(A.l) 


• The recursive computation of f}J = [/ — HipK]*D *[/ — HipK]T in Eq. (3.7) in STEP (A) can be 
carried out via the 0(N) tree topology forward dynamics algorithm described below. It also results in 
the computation of a 5 =ip*D~ 1 [I - Hil>K]T required in STEP (B) as an intermediate quantity. 


for j = 1 • • • l 

' If * -1 (i) = 0, then z(0j) = 0,T(0,) = 0 

for k = lj • • • nj 

< ) z(k) = rP(k,k-\)z(k-l) + K{k,k-l)T(k-l) 

] c(Jb) = T(k)-H(k)z(k) 

= D~ l (k)((k) 

end loop 
, end loop 
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a f (nt + 1 ) = 0 
for j = l ■ ■ ■ 1 


for k = nj ■ ■ • 1 j 


a f (k) = V’*(* + l,*)d / (*+l) + //*(*Mfc) 
0f (k) = v{k)~ K*(k + l)at(k+ 1) 
t end loop 
end loop 


(A. 2) 


STEP (C) requires the computation of A = B*fl5. In order to obtain a O(.V) recursive scheme for 
the computation of fl we first define the matrix T as the one satisfying the equation: 


H*D~ l H = T-5;T^ 


(A.3) 


T as defined above is a block diagonal matrix and its elements can be computed recursively. We now 
obtain the following decomposition of fl. 


Lemma 5: 


0 = T + ^T + T 


(A. 4) 


Noting that ij> is strictly lower triangular, we can then recognize that T as nothing but the diagonal 
elements of fl. We now present a recursive scheme to compute the block diagonal elements of T and of 
fl. 

T(n* + 1) = 0 
for j = t -1 

for k = nj • lj 

J(k) = 'P m (k+l,k)r(k+l)Mk+l,k) + H*(k)D- 1 (k)H(k) 

fl(M) = T(ifc) 
for m = k — 1 1 j 

= Q'(m,lt) = T(l:,ra+l)^(ra+l,m) 

end loop 
end loop 
end loop 

The above recursion yields the elements on the block diagonal of Since Q, is symmetric, the 
off diagonal elements satisfy = can be computed from the diagonal elements as follows. 

Q/j for / G 1 • • * (j — 1) can be obtained via the following recursive scheme: 

f if t p (l) = j for some p > 0 
fork = nj lj 


for ni — nj • lj 

Q(k,m) = n*(m, k) = Q(k } 
end loop 
end loop 

else 

fly, = fl,V = 0 

l end if 


(A. 5) 
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• The 0(M) recursive implementation of (3* = -[/ - H1>K)* f in Eq. (3.12) in Step (e) is 
given below: 




Define x = — BA* f 
for j = 1 • / 

f If i~ l (j) = 0 , then z(0j) = 0,x(0 ; ) = 0 

for k = 1 j • ■ • nj 

z(k) = 1>(k,k-l)z(k-l) + K{k, k-l)z(k-l) 

4 e(*) = ~H(k)z(k) 

«/(*) = D~ l (k)((k) 

k end loop 
end loop 


a(n t + 1) = 0 
for j = t ■ ■ * 1 

I for k = n: • 1: 

a(Jfc) = 1> m {k + l,k)a(k + \) + H*{k)v(k) 

/3 4 (Jb) = + + 

end loop 
k end loop 


(A.6) 



Figure 1: Illustration of linlc/branch numbering convention for multibody system 
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H.M. Chun, J.D. Turner, and H.P. Frisch 
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Abstract 

The ability to simulate and analyze the dynamics of complicated 
multibody systems has been of great benefit to engineers for the past 
decade. Applications have included robotics, land vehicles, and spacecraft. 
However, many of the commercially available software have been 
computationally intensive and are costly and time-consuming for analyzing 
large systems. Fortunately, recent developments in Order (n) algorithms 
and parallel processing for multibody dynamics simulation have drastically 
reduced the computer time needed to simulate systems involving many 
bodies. 

This paper presents the formulation of an Order (n) algorithm for 
DISCOS (Dynamics Interaction Simulation of Controls and Structures), which 
is an industry-standard software package for simulation and analysis of 
flexible multibody systems. For systems involving many bodies, the new 
Order (n) version of DISCOS is much faster than the current version. Results 
of the experimental validation of the dynamics software are also presented. 
The experiment is carried out on a seven-joint robot arm at NASA's 
Goddard Space Flight Center. 

The algorithm used in the current version of DISCOS requires the 
inverse of a matrix whose dimension is equal to the number of constraints 
in the system. Generally, the number of constraints in a system is roughly 
proportional to the number of bodies in the system, and matrix inversion 
requires 0(p 3 ) operations, where p is the dimension of the matrix. The 
current version of DISCOS is therefore considered an Order (n 3 ) algorithm. 
In contrast, the Order (n) algorithm requires inversion of matrices which 
are small, and the number of matrices to be inverted increases only linearly 
with the number of bodies. 

The newly-developed Order (n) DISCOS is currently capable of handling 
chain and tree topologies as well as multiple closed loops. Continuing 
development will extend the capability of the software to deal with typical 
robotics applications such as put-and-place, multi-arm hand-off and surface 
sliding. 
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Abstract 

Multibody system equations can be generated in various forms. All of 
these may be interpreted as results of two basic approaches, the 
augmentation- and the elimination-method. The former method yields the 
descriptor form of the system motion, a set of differential-algebraic 
equations (DAE), and the latter the state space representation, a minimal set 
of ordinary differential equations (ODE). Both of these methods are 
surveyed. Particular emphasis is on the discussion of recursive 
computational schemes, generating the equations of motion with a number 
of operations, which is proportional to the number N of system bodies 
(CXAD-formuiations). 

For simulation purposes one would like to create that set of system 
equations, which can be generated most efficiently and for which the most 
efficient and reliable solution techniques are available. Numerical solution 
techniques for ODE have been studied in great detail and they are well- 
developed. By contrast, DAE have not been investigated for such a long 
time. In view of new developments in the latter field the generation of all 
the equations required for an efficient and reliable solution of DAE 
describing multibody system motion is discussed. These methods, i.e. an 
0(N)-formulation and new techniques for solving DAE, are implemented in 
the SIMPACK code. Its capabilities are illustrated by simulation of 
multibody robot models. 
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ABSTRACT 

This paper summarizes the formulation of a method known as the joint coordinate method for 
automatic generation of the equations of motion for multibody systems. For systems containing 
open or closed kinematic loops, the equations of motion can be reduced systematically to a 
minimum number of second order differential equations. The application of recursive and 
nonrecursive algorithms to this formulation, computational considerations and the feasibility of 
implementing this formulation on multiprocessor computers are discussed. 

1. INTRODUCTION 

In the past decade, the joint (or natural) coordinate method has been implemented in formulating 
the equations of motion. The methodology for open loop systems is well developed in a variety of 
forms [1-5]. For these systems, the method yields a minimal set of equations of motion since the 
joint coordinates are independent. The joint coordinates are no longer independent when closed 
kinematic loops exist in a system. For multibody systems containing simple closed loops, constraint 
equations between joint coordinates may be derived easily. However, for most spatial closed loops, 
the derivation of these constraints can be rather complicated. A simple method for automatic 
generation of the closed loop constraints, and a technique to generate a minimal set of differential 
equations of motion has been reported in reference [6]. 

This paper briefly describes the method of joint coordinates for the systematic generation of 
the equations of motion for open and closed loop systems. These equations are shown to be 
suitable for recursive and nonrecursive algorithms, serial or parallel processing, and symbolic 
manipulation. While the discussion principally focuses on multi-rigid-body systems, the assumption 
of rigidity may be relaxed by introducing the finite element technique in modeling the deformation 
of bodies. This formulation has been incorporated in a set of large-scale computer programs which 
have been used extensively to simulate the dynamic behavior of a variety of multibody systems. 

2. EQUATIONS OF MOTION 

The equations of motion for a multibody mechanical system can be expressed in different forms 
depending upon the choice of the coordinates and velocities that describe the configuration and 
motion of the system. In the following subsections, the equations of motion are first expressed in 
terms of absolute coordinates and velocities of the bodies in the system. Then these equations are 
reduced to a minimal set of equations for open-loop systems. Furthermore, the equations are 
generalized for systems containing closed kinematic loops. 
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2.1 Absolute Coordinate Formulation 

In order to specify the position of a rigid body in a global non-moving xyz coordinate system, it is 
sufficient to specify the spatial location of the origin (center of mass) and the angular orientation of a 
body-fixed £nc coordinate system as shown in Fig. 1. For the ith body in a multibody system, vector 
qj denotes a vector of coordinates which contains a vector of Cartesian translational coordinates r\ 
and a set of rotational coordinates. Matrix Aj represents the rotational transformation of the SnCj 
axes relative to the xyz axes. A vector of velocities for body i is defined as Vj, which contains a 
3-vector of translational velocities fj and a 3-vector of angular velocities u>j. The components of the 
angular velocity vector wj are defined in the xyz coordinate system rather than the body-fixed 
coordinate system. A vector of accelerations for this body is denoted by Vj, which contains t\ and Wj. 
For a multibody system containing b bodies, the vectors of coordinates, velocities, and accelerations 
are q, v, and v which contain the elements of qj, Vj, and vj, respectively, for i = 1, ..., b. 

The kinematic joints between the bodies can be described by m-independent holonomic constraints 
as 


*(q) = 0 (1) 

The first and second time derivatives of the constraints yield the kinematic velocity and acceleration 
equations 

# = Dv SB o (2) 

¥ = Dv + 6v as o (3) 

where D is the Jacobian matrix of the constraints. The equations of motion are written as [7] 

Mv - D^X = g (4) 

where M is the inertia matrix containing the mass and the inertia tensor of all bodies, X is a vector of 
m Lagrange multipliers, and g = g(q, v) contains the gyroscopic terms and the forces and moments 
that act on the system. The inertia matrix is not a constant matrix since the rotational equations of 
motion are written in terms of the global components of the angular accelerations. The term D T X in 
Eq. 4 represents the joint reaction forces and moments. Equations 1-4 represents a set of 
differential-algebraic equations of motion for a multibody system when absolute coordinates are used. 
These equations will have the same form whether the system is open- or closed-loop. 

2.2 Joint Coordinates and Open Loop Systems 

The constrained equations of motion expressed by Eqs. 1-4 can be converted to a smaller set of 
equtions in terms of a set of coordinates known as joint coordinates. For multibody systems with open 
kinematic loops, this conversion yields a set of differential equations equal to the number of degrees 
of freedom. We may consider one branch of an open-loop system as shown in Fig. 2. The bodies are 
numbered in any desired order. The relative configurations of two adjacent bodies are defined by one 
or more so-called joint (or natural) coordinates equal in number to the number of relative degrees of 
freedom between these bodies. For example, 0|j contains two angles if there is a universal joint 
between bodies i and j, or it contains only one translational variable if there is a prismatic joint 
between the two bodies. The vector of joint coordinates for the system is denoted by 0 containing all 
of the joint coordinates and the absolute coordinates of a base (reference) body if the base body is 
not the ground (floating base). Therefore, vector 0 has a dimension equal to the number of degrees 
of freedom of the system. The vector of joint velocities is defined a # s 0, which is the time derivative 
of 0. It can be shown that there is a linear transformation between 0 and v as [1-4] 

v = B0 ( 5 ) 
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Figure 1 Body-fixed and global coordinate systems. Figure 2 An open-loop system. 

Matrix II is orthogonal to the Jacobian matrix D. This can be shown by substituting Eq. 5 in Eq. 2 to 
find DB8 =0. Since 8 is a vector of independent velocities, then 

DB = 0 


The time derivative of Eq. 5 gives the transformation formula for the accelerations, 

v = B$ + B8 

Substituting Eq. 7 in Eq. 4, premultiplying by B T , and using Eq. 6 yields 

Mb' = f <«> 


where 


m = b t mb W 

f = B T (g - MB8) ( 10 ) 

Equation 8 represents the generalized equations of motion for an open-loop multibody system when the 
number of the selected coordinates is equal to the number of degrees of freedom. 

2.3 Joint Coordinates and Closed-Loop Systems [6] 


The equations of motion for open-loop systems, represented by Eq. 8, can be modified for systems 
containing closed kinematic loops. Assume that there is one or more closed kinematic loops in a 
multibody system, such as the closed-loop shown in Fig. 3(a). In order to derive the equations of 
motion for such a system, the closed-loop is cut at one of the kinematic joints in order to obtain an 
open-loop system as shown in Fig. 3(b). For this cut system, which will be called the reduced system, 
the joint coordinates are defined as for any open-loop system. It is clear that if we now close this 
system at the cut joint(s), the joint coordinates will no longer be independent, i.e., for each closed- 
loop there exist one or more algebraic constraints between the joint coordinates of that loop. 

The constraint equations for the closed kinematic loops may be expressed implicitly as 
T(8) =0 ( 11) 

The time derivative of the constraints yields 

f = ce = o 
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( 12 ) 





Figure 3 A system containing a closed-loop, (a) The closed-loop, 
(b) Its reduced open-loop representation. 


where C is the Jacobian matrix of these constraints. Similarly the acceleration constraints for the 
closed-loops are written as 


¥ = c&’ + £e = o 


(13) 


Now the equations of motion of Eq. 8 are modified for closed loop systems as 
Mb* - C T v = f 


(14) 


where v is a vector of Lagrange multipliers associated with the constraints of Eq. 11. Equations 
11-14 represent the equations of motion for a multibody system when the number of selected joint 
coordinates is greater than the number of degrees of freedom of the system. 


It is shown in [6J that the Jacobian matrix C in Eqs. 12-14 can be obtained systematically. If the 
Jacobian of the constraints for the cut-joint(s) is denoted by D*, then the product of this matrix and 
matrix B yields the C matrix as 


D*B-C (15) 

The product D*B, in most cases, will have redundant rows. Matrix C is found after the redundant rows 
are eliminated. Since the elements of both matrices D* and B are available explicitly in symbolic 
form, the elements of C can also be determined symbolically. Note that C is a small matrix in 
comparisson with D and B. Furthermore, since the elements of C can be determined symbolically, 
the term Co in Eq. 13 can also be found symbolically. 


2.4 Minimum Number of Equations of Motion for Closed-Loop Systems 

For a multibody system containing closed kinematic loops, the Lagrange multipliers of Eq. 14 can be 
eliminated in order to obtain a minimal set of equations of motion in terms of a set of independent 
joint accelerations. For this purpose, a set of independent joint coordinates are selected as a subset 
of vector 0. Then a velocity transformation matrix E can be found such that [6] 

» = Ee (i) (16) 

• 

where 0(j) is the vector of independent joint velocity with a dimension equal to the number of degrees 
of freedom of the system. Note that the joint velocities outside the closed-loops and the independent 
joint velocities within the closed-loops are not affected by this conversion. The matrix E is 
orthogonal to the C matrix; i.e., 


CE = 0 

The time derivative of Eq. 17 gives 


(17) 
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(18) 


S = E & (i) + Ee (i) 


Substituting of Eq. 18 in Eq. 14, premultiplying by E^, and using Eq. 16 yields 


M'S(i)=f 


where 

M' = E t ME ( 2 °) 

f = E T (f - ME8 (j) ) ( 21 ) 

Equation 19 represents the minimum number of equations of motion describing the dynamics of a 
multibody system containing closed kinematic loops. 

Matrix E can be found in either explicit form or in numerical form for most closed kinematic 
loops. For this purpose, we can partition vector 8 into dependent and independent sets, 0(d) and 
0(jj, and correspondingly we can partition matrix C into two submatrices C(d) an( ^ ^*(i)* Therefore, 
Eq. 16 becomes 

C (d)»(d) +C (i)®(i) =° 

This equation can be written as, 

• "1 • 

°(d) = ' c (d) c (i)9(i) 


-Cfd) c (i) 


where a proper selection of the independent joint velocities guarantees that is a nonsingular 

matrix. Comparing Eqs. 16 and 22 yields an expression for E as 


E = 1 (23) 

_ _C (d) C (i). 

Since the elements of matrix C are available explicitly, it may be possible to find the elements of E 
explicitly. This is due to the fact that the operation of Eq. 23 is performed separately on each 
closed-loop, and in addition, the C matrix for a closed-loop is relatively small in practical 
applications. 

• 

Equation 21 states that for evaluating the equations of motion, in addition to matrix E, matrix E is 
also neecfpd. An apparent approach is to evaluate the time derivative of Eq. 23. However, since the 
product EB(i) is needed in Eq. 21, we can evaluate this product directly. For this purpose, Eq. 13 is 
written in a partitioned form as 

c (d) ff (d) + C (i) b ’(i) = -< " 8 
This equation is then rearranged as 

ff (d) = - c (d) 1<: (i) ff (i) ' C (d) V« 
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(24) 


or. 


ff = 


L^(d)" 1c (i)j 


*0) + 




L“^(d) 


Co 


Comparison of Eqs. 18 and 24 yields 




(25) 


In the process of solving Eq. 19, the independent joint accelerations and velocities are integrated 
to obtain the independent joint velocities and coordinates. Then Eq. 16 (or Eq. 12) is used to find the 
dependent joint velocities. However, prior to that the dependent joint coordinates need to be 
computed. In order to find the dependent joint coordinates, the constraints of Eq. 11 must be solved 
for each closed-loop. These constraints are nonlinear in 0 (or q) and they are not available 
explicitly. However, an iterative formula for finding the dependent joint coordinates can be derived. 
By applying the Newton-Raphson method to Eq. 11, the iterative formula is found as 


C A0 - - ♦* (26) 

where A0 denotes the corrections in vector 0 and ♦* denotes the violation in the constraints. Note 
that the violation in the constraints of Eq. 11 is the same as the violation in the constraints written in 
terms of the absolute coordinates of the bodies common to the cut joint(s). For the sake of simplicity, 
the iteration formula of Eq. 26 is shown in its abstract form. We assume it is understood that in this 
equation the known (independent) elements of 0 and their corresponding columns of C have been 
manipulated properly. Furthermore, if there is more than one closed kinematic loop in the system, this 
iterative process can be applied separately to the constraint equations of each loop. The important 
point to draw from Eq. 26 is that explicit expressions for the joint coordinate constraints, represented 
by Eq. 11, are not needed. 


3. COMPUTATIONAL CONSIDERATIONS 

For most multibody systems, the inertia matrix M and the vector g containing the external 
forces/moments and the gyroscopic terms can be constructed systematically [7], A systematic 
construction of the velocity transformation matrix B, for open or reduced open loop systems, has been 
shown in [4]. This matrix is constructed from the topology of the system by assembling smaller block 
matrices representing different type of kinematic joints. Since majrix B can be constructed in 
explicit form in terms of the absolute coordinates q, then matrix B can also be determined and 
expressed in explicit form as a function of q and v. For multibody systems containing closed 
kinematic loops, matrix C for Eqs. 11-14 and matrix E for Eq. 18 must be constructed. The process 
outlined in the preceding sections will be demonstrated by a simple example. 

Example: Consider the multibody system shown in Fig. 4(a), containing four moving bodies. Body 1 
is connected to the ground by a prismatic joint Tj , and there are four revolute joints, R s through R s , 
with parallel axes connecting the bodies in a closed-loop. If the closed-loop is cut at R„ the reduced 
system of Fig. 4(b) is obtained. For the reduced system, four joint coordinates, 6j , 0, , 0, , and 0* 
are defined, where 0, represents relative translation between body 1 and the ground, and the other 
three joint coordinates represent relative rotations between the adjacent bodies. Four unit vectors, 
u t through u % , are defined along the four joint axes. The vector of absolute velocities v is a 
24-vector, where the vector of relative velocities 0 is a 4-vector. The velocity transformation 
matrix for this reduced system is: 
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Figure 4 (a) A multibody system with four moving bodies containing a closed-loop* 
(b) Its reduced open-loop representation* 
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symmetric matrix made of the components of a 3-vector a, and ab represents the cross product of two 
vectors a and b. The structure of B shows that the matrix is constructed from 6x1 block matrices 


[ml 

0 r re P resen *' n S a P r ' smat ' c joint along the unit vector u lt and 6x1 block matrices 
representing revolute joints along unit vectors Uj, i = 2, 3, and 4 [4]. 


-3ijUi 

us 


The constraint equations for the cut revolute joint R 5 , i*e., •*, between bodies 1 and 4 in terms 
of the absolute coordinates of these bodies can be expressed as [7]: 


* 3 = *i + - r, - s, = 0 


• 2 = r^n* = 0 

where •* represents three algebraic equations stating that two points Pi and P* on the joint axis must 
coincide, and ♦* states that two unit vectors t\i and n,, along the axis of R* must remain parallel* It 
should be noted that the cross product of two vectors yields three abgebraic constraints, and only two 
out of three are independent* The time derivative of these constraints yields [7): 
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n^ui - nin^Mn = 0 


or, 


I “*i 0 0 0 0 -1 
0 n k H l 0 0 0 0 0 




Therefore, the coefficients of the velocity components provide the Jacobian matrix for this cut 
re volute joint as: 


D 




I 0 0 0 0 -1 

0 n„rii 0 0 0 0 0 


-n i n » 


(5x24) 


Note that this is a 5x24 matrix. The product D*B is found to be 


o (3 2 «. + s.,)u 1 (3,,,+sOu, 
0 -ri^^Us 




niSnii* 


-1(5x4) 


which is a 5x4 matrix since the columns of the matrix correspond to the four joint coordinates* Note 
that the first column of the matrix is all zeros, and this column corresponds to which is not in the 
closed-loop* Based on the initial assumption, the four revolute joint axes are parallel, therefore, the 
cross product n H u 2 = n^Uj = n^u* = 0. This means that the last two rows of the 5x4 matrix are zeros 
and they can be eliminated from the matrix. This leaves a 3x4 matrix as: 


D B — [0 (3 2 i, + 5 % )U2 (3 3 *+5*)U 3 (3** + S*)U*]^ x 4 ) 

If for a given configuration numerical values are substituted for the components of the vectors in this 
matrix, it will be found that the three rows are not independent — one row is redundant and can be 
eliminated. For example, for a particular configuration, the numerical values of the elements of this 
matrix may be: 


0 0 -1.4142 

0 0 -1.4142 

0 -1.4142 -1.4142 


-1.4142 

-1.4142 

0 


J (3x4) 


(a) 


This result should have been expected, since the closed-loop is a four-bar mechanism with one relative 
degree of freedom between its four bodies. Since there are three joint coordinates associated with 
this four-bar mechanism, there must be only two independent constraints between them. Therefore, 
matrix D # B of Eq. (a) becomes 



0 0 -1.4142 -1.4142 

0 -1.4142 -1.4142 0 

Since matrix C is available in explicit form, its time derivative and hence the product can be found 
for Eq. 13. 

After elimination of the redundant row matrix C can be written in abstract form as 



(b) 


0 C 1 C 2 C 3 
0 C H C 5 c 6 


where c x , c 6 are known explicitly from Eq. (a). If we choose 0 2 as the independent joint 
coordinate for the closed-loop, and noting that 0! is independent from the loop, we can have 

0i 

e, 
e, 

where 


CjCi* " CjCg C1C5 - C 2 C* 

— , and e 2 = 

C 2 C 6 - C3C5 4 c 2 c 6 - C3C5 

If the numerical values of the elements of C from Eq. (b) are used, matrix E will be found to be 


1 0 
0 1 



1 0 
0 1 


4. RECURSIVE VERSUS NONRECURSIVE ALGORITHMS 


The equations of motion for open or closed loop systems (Eq. 8 or Eqs. 11-14) can be generated and 
solved for the accelerations either recursively or nonrecursively. Since the inertia matrix M is 
symmetric, a standard nonrecursive technique such as L^DL factorization can be employed to solve for 
the unknown accelerations. An alternative recursive approach for finding the unknown accelerations 
was first developed by Featherstone [8]. The idea is to remove one body from a multibody system and 
then to consider the remaining system as a new multibody system. Articulated body inertias are the 
properties which make the remaining system act as the original one. The articulated inertias are 
calculated by projecting the mass and inertia of the removed body onto the remaining system. 
Repeatedly removing one body from the multibody system leads to a system with only one body for 
which the accelerations can easily be calculated. The accelerations of the removed bodies will then 
be obtained by back substitution. Wehage interpreted this process mathematically by using a large 
number of equations of motion [9-11]. The unknowns of the equations are the absolute and joint 
accelerations, as well as joint reaction forces. The equations of motion are solved by applying matrix 
partitioning. This more theoretical approach allows for a very general formulation of the recursive 
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projection algorithm. Wehage shows that a recursive algorithm is equivalent to a "block LU 
factorization." 

For an open loop system containing n joints, a nonrecursive matrix factorization algorithm requires 
a CPU time of approximately Order(n 2 ). For the same system, a recursive algorithm requires a CPU 
time of Order(n). However, the factor in front of 0(n) or 0(n 2 ) can make one algorithm more or less 
efficient than the other, depending upon n. Therefore, some examples are shown in this section to 
clarify this concept. 

In the recursive algorithms, the velocity transformation matrix B can be represented as the 
product of two matrices [ 9 J, 

B = G _1 H (27) 

The matrices G and H are found from velocity transformation equations between consecutive bodies as 

v. « G.v. + H.6. (28) 

J J ' J I 

The equations of motion for an open loop system; i.e., Eq. 8, are then written as 

(G" 1 H) T IIG" 1 Hl = (G -1 H) T (g - MG _1 t) (29) 

where y contains quadratic velocity terms which can be constructed from 

Y. = 6,v. + H. 0. (^0) 

J I * J J 

A detailed description of the recursive algorithm used in this study can be found in [12]. 

Two simple examples are considered here for comparison of the CPU times. Figure 5 shows a 
highly parallel and a highly serial system. In both systems, the number of joints n is increased, 
starting from one, between simulations. For the parallel system with only revolute joints, the 
nonrecursive method is more efficient than the' recursive method regardless of number of joints, as 
shown in Fig. 6(a). It can be observed that both algorithms yield CPU times that increase almost 
linearly as n is increased. For the serial system, however, there is a breakpoint beyond which the 
recursive algorithm becomes more efficient than the nonrecursive algorithm. As shown in Fig. 6(b, c, 
d), the breakpoint for the number of joints n is six, nine, and five when the serial system contains only 
revolute joints, prismatic joints, or spherical joints respectively. 




Figure 5 Two systems with (a) a highly parallel and (b) a highly serial structures. 
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Figure 6 CPU time for one function evaluation versus number of joints for (a) a highly parallel system, 
and (b), (c), (d) highly serial systems containing only revolute, prismatic, and spherical joints 
respectively. 

In reference [11], the recursive projection algorithm of open loop systems is modified for systems 
containing closed kinematic loops. This algorithm has been tested and the result is reported in [12]. 
It is shown that since a closed loop is cut at one of the joints to form a reduced open loop system, the 
breakpoint for the number of joints in a closed loop is approximately double of that of an open loop 
system. For example, if the closed loop contains only revolute joints, there should be approximately 
twelve or more bodies in the loop for the recursive algorithm to exhibit more efficiency than the 
nonrecursive method. 

For mechanical systems with only rigid bodies, it is rather unlikely to have open or closed loops 
with enough number of bodies to make a recursive algorithm more efficient than a nonrecursive one. 
However, when one or more of the bodies in a system are considered as deformable, then the concept 
of recursive projection technique becomes highly attractive. 

5. PARALLEL COMPUTATIONAL CONSIDERATIONS 

When computation on a multiple-instruction multiple-data (MIMD) multiprocessing system is 
considered, obvious parallelisms arising from the topology (e.g. multiple branches) can be exploited. 
However, a true measure of the suitability of a computational scheme for parallel processing is the 
degree of intrinsic parallelism in the scheme for the worst case (i.e. single branch open-loop linkage). 
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The formulation described by Eq. 8 was applied to a 6 degree -of -freedom Stanford arm that 
consists of a base body plus 6 links all of which are connected to each other serially by revolute joints 
except link 3 which is connected to link 2 by a prismatic joint. The algorithm to compute • at each 
time step was represented as a data-flow graph. It is known that the shortest possible time to 
traverse the graph from its beginning to its end is the length of the longest possible path in the graph, 
or the critical path. 

The maximum speedup for any multiprocessing system is, therefore, the ratio of the serial 
computation time to the time corresponding to the critical path of the data-flow graph, since the 
latter is a property of the computational scheme alone. The length of the critical path was computed 
for a Stanford arm possessing 1 through 6 degrees of freedom. A plot of the maximum speedup with 
the degrees of freedom in the Stanford arm is shown in Fig. 7. The plot indicates the possibility of 
large speedups (11 to 76) if a suitable multiprocessing system and a proper scheduling algorithm are 
used. It is also observed that the length of the critical path of the data-flow graph resulting from the 
formulation increases linearly as the number of degrees of freedom increases in a serially connected 
multibody system. This suggests that the maximum speedup will approach a constant (ratio of the rate 
of increase of serial computation time to the rate of increase of critical path length) for a large 
number of degrees of freedom. 



Figure 7 Maximum speedup versus number of degrees of freedom for the Stanford arm. 

The velocity transformation of Eq. 5 offers other possibilities for parallel processing. It is 
evident from the approach described above that if the coordinate set, 0, used to describe the 
configuration of the system is such that the critical path of the data-flow graph does not increase or 
increases very slowly as new degrees of freedom are added to the system, then the maximum speedup 
will increase linearly or approach a very high limit at large numbers of degrees of freedom. 
Therefore, if matrix B can be chosen such that the resulting 0 is this type of a coordinate set, then 
the formulation would be very suitable for parallel processing because of very large potential gains in 
speedup. Work is currently under way to determine such matrices B automatically on the basis of the 
topology imposed by each type of joint in a multibody system. 

6. DEFORMABLE BODIES 

In the dynamic analysis of multibody systems, the elastodynamic effects may play an important role on 
the behavior of the system. The most popular technique for describing the flexibility of the 
components of a system is the finite element method. In standard finite element formulation, the 
gross motion (large displacements, large deformations) is not taken into account. However, in order 
to analyze flexible multibody systems, such phenomena must be considered. Several researchers have 
suggested procedures that successfully introduce elastodynamic effects into multibody dynamics 
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equations [ 1 3 -1 5 )• The main problem with the inclusion of elastodynamic effects is that the flexible 
bodies may have a relatively complex geometry. This implies that a large number of nodes may be 
necessary and, therefore, a system of equations with a large number of degrees of freedom will result. 
In order to gain computational efficiency, a formulation based on the modal superposition method has 
been suggested to reduce the number of degrees of freedom of the model [14]. 

The method of joint coordinates for multi-rigidbody dynamics can be extended to a system of 
mixed rigid and flexible bodies. The equations of motion for the rigid bodies are written in terms of 
the absolute coordinates, similar to Eqs. 1-4, and the equations of motion for the flexible bodies are 
written in term of either nodal or modal coordinates. Additional kinematic constraints may be 
necessary to represent the connectivities between rigid-to-f lexible and f lexible-to-f lexible bodies. 
Then, a velocity transformation process, similar to that of rigid bodies. (Eqs. 5-10) can be applied to 
remove the algebraic constraints and their corresponding Lagrange multipliers. The resultant 
equations of motion for an open loop system can be converted to a minimal set of differential 
equations. In the case of closed loop systems, the equations of motion may or may not contain 
algebraic constraints and Lagrange multipliers. 

For systems containing flexible bodies with linear elastic material properties, the equations of 
motion with modal coordinates can be used. However, in some applications, it may be necessary to 
consider the equations of motion in terms of the nodal coordinates in order to obtain more accurate 
results. 

7. CONCLUSION 

The equations of motion for multibody systems containing closed kinematic loops can be written either 
as a set of differential-algebraic equations (Eq. 11-14) or as a set of ordinary differential equations 
(Eq. 19). The elements of the constrained equations of motion given by Eqs. 11-14 can be constructed 
efficiently. Although all of the joint coordinates are not independent of each other, and hence the 
number of integration variables is not a minimum, the numerical integration of these equations can be 
performed efficiently. For example, a dynamic simulation of the system shown in Fig. 4, including 
several force elements, was performed using two different formulations -- the absolute coordinate 
formulation of Eqs. 1-4 and the joint coordinate formulation of Eqs. 11-14. The numerical integration 
of the equations of motion in both cases was carried out using a predictor-corrector Adams-Bashforth 
algorithm on a desktop workstation. The CPU time for simulating ten seconds of dynamic response 
was 352 seconds for Eqs. 1-4 and 75 seconds for Eqs. 11-14. The results obtained from these and 
other simulations have shown that the formulation of Eqs. 11-14 yields about five times or more 
efficiency over the formulation of Eqs. 1-4. The degree of efficiency depends on the number of 
bodies, number of joints, and the connectivity between the bodies. 

Equation 19 provides the minimum number of equations of motion for a multibody system 
containing closed kinematic loops. The number of equations and the number of integration variables 
are smaller when compared to those of Eqs. 11-14. Therefore, it may appear that the numerical 
solution of Eq. 19 to be more efficient than that of Eqs. 11-14. However, a careful examination of 
the elements of Eq. 19 would reveal that the overhead associated with evaluating these elements may 
be more than the overhead associated with the additional number of equations and integration 
variables of Eqs. 11-14. Numerical simulations of several problems using the two methods have shown 
that the computation time associated with these two formulations are about the same. 

Systematic generation of the elements of the equations of motion with the joint coordinates 
makes the formulation ideal for symbolic generation of these elements. Computer programs have been 
developed that symbolically generate the equations of motion for rigid body systems. The equations 
are generated in an optimized fashion to improve the computational efficiency of the dynamic 
simulation. The programs dealing with rigid and flexible bodies evaluate the equations of motion 
numerically. The equations of motion for deformable bodies can be considered either in terms of the 
modal or the nodal coordinates. 
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Another interesting feature of these equations is that the process of solving the equations of 
motion for unknown accelerations can be performed either recursively or nonrecursively. It has been 
shown that for highly serial systems with long chains, a recursive process may yield computational 
efficiency. Further adaptation of these equations to multiprocessor computers results in a highly 
efficient simulation package. 
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Abstract 

The equations of motion of a multibody system are nonlinear in nature, and thus pose a 
difficult problem in linear control design. One approach is to have a first-order 
approximation through the numerical perturbations at a given configuration, and to design a 
control law based on the linearized model. In this paper, a linearized model is generated 
analytically by following the footsteps of the recursive derivation of the equations of motion. 

The equations of motion are first written in a Newton-Euler form, which is systematic 
and easy to construct; then, they are transformed into a relative coordinate representation, 
which is more efficient in computation. A new computational method for linearization is 
obtained by applying a series of first-order analytical approximations to the recursive 
kinematic relationships. The method has proved to be computationally more efficient because 
of its recursive nature. It has also turned out to be more accurate because of the fact that 
analytical perturbation circumvents numerical differentiation and other associated numerical 
operations that may accumulate computational error, thus requiring only analytical 
operations of matrices and vectors. 

The power of the proposed linearization algorithm is demonstrated, in comparison to a 
numerical perturbation method, with a two-link manipulator and a seven degrees of freedom 
robotic manipulator. Its application to control design is also demonstrated. 

1. Introduction 

The behavior of a nonlinear dynamic system can be approximated by a linearized model 
in the neighborhood of a reference configuration. Intuitively, linear models of dynamic 
systems can be obtained by simply omitting nonlinear effects of the nonlinear dynamic 
systems, such as Coriolis forces, centrifugal forces, and the interaction forces between bodies. 
Such a model, however, cannot satisfy the needs of computer-aided design of control systems 
for multibody systems because the intuitive simplification is usually case-dependent. 
Therefore, a better linearized dynamic model based on a general purpose dynamics model is 
necessary. The approach that fits this requirement most is first-order approximations of the 
nonlinear dynamic models, which yield valid results in the neighborhood of the reference 
configuration for dynamic and control analysis. A straightforward approach to obtain first- 
order approximations of multibody systems is first to generate the analytical closed-form, 
nonlinear equations of motion of the systems, and then to generate the linearized equations of 
motion using first-order Taylor expansions. Unfortunately, these analytical equations of 
motion are generally not available because they are too complex to be generated. 

Due to the difficulty of analytically generating the closed-form, nonlinear equations of 
motion of a multibody system, a numerical perturbation method is usually applied to obtain a 
linearized model at a certain configuration. For instance, in DISCOS [1] the numerical 
perturbation method is employed to generate a linearized model for stability analysis of a 
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multibody system at a selected configuration. Following a similar idea, Liang [2] implemented 
a numerical perturbation method with DADS [3] for multibody mechanical system control. 
Moreover, the numerical perturbation method has been widely implemented in dynamic and 
control analysis. For example, Sohoni and Whitesell [4] applied it in ADAMS, and Singh, 

Likins, and Vendervoortt applied [5] it to generate linear models of flexible body systems. 

In the implementation of the numerical perturbation method, iterative computations 
are employed to ensure that the resulting linearized models are accurate. However, the 
iterative computation sometimes may not generate a satisfactory linearized model because of 
failure in convergence. Therefore, a trade-off between the accuracy and convergence must be 
made to generate a useful linearized model. An accurate linearized model is difficult to be 
generated with good computational efficiency when the numerical perturbation approaches are 
applied. In resolving this problem, the numerical perturbation method must be avoided during 
the linearization procedure. 

On the other hand, symbolic programming languages can be used to devise efficient 
computational techniques to obtain the linearized manipulator models. Vukobratovic and Nenad 
[6] proposed the linearization technique that first generates the nonlinear dynamic models of 
the manipulator by means of symbolic programming languages, and then takes first-order 
approximation from the given nonlinear model. Following the same approach, Neuman and 
Murray [7] linearized symbolically the Lagrangian dynamic robot model about a nominal 
trajectory to generate the linearized and trajectory sensitivity models of a manipulator. 
Balafoutis, Misra, and Patel [8] further extended Neuman's approach to obtain more 
computational efficiency in generating linearized models by using the fact that the derivatives 
of trigonometric functions need not be computed explicitly, and that the partial derivatives of 
the homogeneous transformation matrices may be obtained merely by row and column 
manipulations. The same idea was applied to generate linearized models for flexible multibody 
systems by Jonker [9]. Thus, although this approach has the advantage of not using the 
numerical perturbation method, there is at the same time a disadvantage: it relies heavily on 
symbolic programming languages. Consequently, the approach is restricted to special case 
studies only until a general purpose symbolic manipulation package for the dynamic modelling 
becomes available. 

In searching for a general purpose computer-aided dynamic analysis algorithm, Bae 
and Haug [10,11,12] developed a recursive formulation, which was later improved by Bae, 
Hwang and Haug [13,14]. In this approach, the equations of motion are first written in a 
Newton-Euler form, which is systematic and easy to construct. They are then transformed 
into a relative coordinate representation, which is efficient for computation. This approach is 
extended in this paper to efficiently generate a linearized model using the recursive 
computational structure and applying the analytical linear approximations of the recursive 
kinematic relationships, without applying numerical perturbations. The computational 
efficiency and opportunity for parallelism of the recursive algorithm would make it possible 
to linearize successively for adaptive dynamics control. 

An analytical linearization algorithm is derived by using the recursive variational 
derivation, and by linearizing kinematic relationships analytically. In the recursive 
formulation, the equations of motion are obtained through a series of coordinate 
transformations. By analytically taking first-order approximations of kinematic 
relationships between Cartesian, state vector, and joint variables and then applying these 
linearized relationships in the recursive variational derivation, linearized equations of motion 
are generated in joint space. The proposed linearization algorithm is shown in Fig. 1 and is 
explained as follows: 

( 1 ) Variational equations of motion are obtained in Cartesian space and the generalized mass 
and force are approximated with first-order Taylor expansions. 

( 2 ) First-order approximate kinematic relationships are obtained between Cartesian 
variables and state vector variables [14] and are substituted into the variational 
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equations obtained in (1). Linearized variational equations in state vector space are 
generated. 

(3) First-order Taylor expansions for the kinematical relationships between state vector 
variables and joint variables are obtained, and then substituted into the approximate 
variational equations obtained in (2). 

( 4 ) Linearized equations of motion are obtained from the approximate variational equations 
in joint space. At this stage, open-chain mechanisms are expressed in terms of 
independent coordinates and closed-chain mechanisms are expressed in a mixed 
differential algebraic equation (DAE) form. 

( 5 ) Linearized equations of motion expressed only in terms of independent coordinates are 
written in state space form for control applications. 

The rest of the paper is organized as following. In Section 2, linearized kinematic 
relationships are expressed in terms of the generalized state vector, which is used to simplify 
expressions and to obtain compact equations. In Section 3, linearized relative kinematics 
relations are derived for two contiguous bodies. The linearized equations of motion are 
developed in Section 4. In Section 5, numerical examples of the recursive linearization 
method are given. In addition, control designs based on the linearized models and linear control 
theory are demonstrated. Finally, conclusions are presented in Section 6. 


2. Generalized State Vector Notation 

In this section, a first-order Taylor expansion is applied to approximate the 
relationship between Cartesian variables and generalized velocity state variables. The 
generalized velocity state vector, called the velocity state, is used to simplify expressions in 
later derivations. It is defined as [13] 


Yp = 


f p + r p fflp 
On 


(1 ) 


where the subscript P represents the origin of a body-fixed frame, as shown in Fig. 2. 
Cartesian velocity of point P can be written as 


The 


Y P = 


(Or 


( 2 ) 


where r p and co p are the translational velocity of point P and the angular velocity of a body- 
fixed frame at point P, respectively. 

From the velocity expressions in Eqs. 1 and 2, the Cartesian velocity Y p is expressed as 



-TpYp (3) 

Replacing r p by the virtual displacement Srp and a>p by the virtual rotation 5np, yield 
the variation of the position state vector. 

5Z P s T P 8Z P ( 4 ) 

The Cartesian acceleration of the body-fixed frame shown in Fig. 2 is defined as the time 
derivative of Eq. 3, 

Yp=TpYp-Vp (5) 
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where V r 


r p<D p 


is a velocity coupling vector. 


L o -I 

Since the right sides of Eqs. 1 to 5 are explicitly expressed in terms of Cartesian 
variables, their first-order expansions can be obtained analytically with respect to 
perturbations in the Cartesian variables. 

First, by expanding Tp at a reference configuration, Eq. 4 can be represented as 

5Z P = ( T p + d Tp + 0(A) 2 ) SZp ( 6 ) 

where d denotes a first-order perturbation with respect to Cartesian variables; the 
superscript o signif 


dtp = 


dT P 

dZ P 


es that the quantity is evaluated at a reference configuration, i.e., 
A 


J evaluated at o 

and A denotes perturbed quantity, which is expressed in terms of the variables in the 
Cartesian space. The perturbation of matrix T p can be obtained as 


dTp 


0 - df p 

0 0 

where the partial derivative of the position vector of point P can be expressed as 


(7) 


dr p _ dr p 


fp d* p 


( 8 ) 


Similarly, the acceleration relationship, Eq. 5, can be expanded as 


Yp - { Tp + d Tp + 0(A)*] Yp - [ Vp + d Vp + O(A)*] ( 9 ) 

The derivatives of the position state variable and the Cartesian variable can be related 
from Eq. 6 as 

dZp = Tp dZp (10) 

and the derivative of the Cartesian velocity vector can be written as 

dY P = d ( T P Y P ) 


= T P d Y P + dT P Y P 


where 


dTp Y P = 


(Dp 

L o 


-<Dp r p 


dZp 


( 11 ) 


( 12 ) 


where d Y p and d Z p are the perturbations of velocity and position state vectors. Based on the 

relationships in Eqs. 9, 1 0, and 11, the derivative of the variables in the Cartesian space can 
be expressed in terms of the derivatives of the state variables. 


3. Relative Kinematics of Two Contiguous Bodies 

In this section, a first-order Taylor approximation is derived to represent relative 
kinematic relationships between contiguous bodies that are constrained by a kinematic joint, 
as shown in Fig. 3. The relative kinematic relation between the velocities of the two 
contiguous bodies i and j is defined as [13] 

Y..v + Ba (13) 

Yj - Yj + D|j q,j 

where 
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Bi 


dd 


3qi 


+ (f j + Sji)^ ij 
H ij 


Relationships between virtual displacements and rotations of these bodies a re ^obtained I y 
replacing the velocity vectors in Eq. 13 with virtual translational and rotational vectors, i.e., 

(14) 

5Zj = 8Zj + Bjj 5qjj 

Taking the time derivative of Eq. 13 yields the acceleration relationship 


Y; = Yj + Bjj qy + D 


(15) 


where 


As shown in Eqs. 13, 14, and 15, the state variables of body j are expressed in terms 
of the state variable of body i and the joint variables used to define the relative motion bet ^® en 
bodies i and j. The first-order Taylor expansions of Eqs. 14 and 15 with respect to the state 
variable of body i and the joint variables can be expressed as 

( 16 ) 


5Z; = 8Zj + [ B 


Yj = Y| 


9 j + dB 


') 


0(A)2] 5qjj 


[ B fj + dB?j + 0(A) 2 ] q ij+ [ 


Djj + dDy 


O(A) 2 ] 


(17) 


where cJBjj and dDjj 


are computed in terms of the state variables of body i and the joint 

variables. 

Linearized equations of motion are generated based on the linearized joint kinematic 
relationships and linearized relationships between the state space variables of bodies i and j 
From Eqs. 13 and 14, relations between the perturbations of state variables and relative 

coordinates are expressed as 

- (18) 
dZj = dZj + Bjj dqjj 


dY 


dY 


= dY j + dBjj qy + Bjj dqjj 


= dYi 


+ dBy qy + 


By dqy + dD 


>j 


( 19 ) 


( 20 ) 


4. Linearized Equations of Motion of a Tree Structural Mechanism 

The linearized equations of motion of a tree structure mechanism that contains n joints 
and n+1 bodies, as shown in Fig. 4, are presented in this section. By going through the 
procedure of variational derivation [13] and by replacing the nonlinear kinematic 
relationships with their first-order approximations, one can generate the linearized 
equations of motion for an open-chain system. Applying the linearized kinematic 
relationships to the recursive variational approach yields the linearized equations of motion 
that are written in terms of the joint variables. 

41 Variational Equatio ns in Cartesian Space 

The variational form of the Newton-Euler equations of motion for an n-body system is 

written as [3] 

0 = ^SZ T j (MjYj - Qj ) (21 ) 

i = 0 
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where Mj is the mass matrix and Qj is the generalized force. The variational equation must 

hold for all kinematically admissible variations 8Zj, i = 1,...,n; i.e., the kinematic constraints 

on the system must be satisfied by 8Zj. Then approximate variational equations can be 

generated from a set of approximations of the generalized mass and force for each body, which 
are expressed as 


Mj = M° + dM? + 0(A) 2 (2 2) 

Qj = Q? + dQ^ + 0(A) 2 (23) 

where dMj and dQj are expressed in terms of Cartesian variables. 

4,2 Variational Equation s in State Vector Space 

Approximated variational equations in state vector space can be obtained by 
substituting Eqs. 6, 10, 22, and 26 into the equations of motion in Cartesian space and by 
replacing Cartesian variables with the state variables. The approximated variational 
equations can be written as 


f I 

= X8Z{{[Tl°M^Yj-Tf(M^V? + Q9] + [(dTfM^TY ^IdM^T? + 

i = o 

TfM^Yj -(dT^M^Vf +TjdM < jV < j ) +t} < M°V^ > +dT{Q° + TjdQ?)] + 0(A) 2 } 


where Vj is a velocity coupling term, which is defined in Eq. 5. In order to simplify Eq. 24, 


the notation of the generalized mass matrix Mj and force vector Qj in the state vector space 

[13] will be used henceforth. The equations of motion are thus expressed as 
n 

0 = X 8Z{ {( M^Yj - Oft + d M^Yj - d6f + 0(A) 2 } (25) 

i = o 


where 


dMj = dTjMjTj +T{dMjTj +T{MjdTj 

dQj = dTjMjVj +T|dMjVj -t-TjMjdVj +dT{Qj +T|dQj 
and dMj, dTj, dVj and dQj, which are expressed in terms of the perturbations of the Cartesian 
variables, can be rewritten in terms of the perturbations of the state variables by 
substituting the relationship between the Cartesian variables and the state variables into 
their expressions. 

4,3, Linearized Equations of Motio n in Joint Space 

The approximated variational equations of motion in state vector space can be rewritten 
in terms of joint variables. As the results in Section 3 indicate, the variables in state vector 
space can be transformed into joint variables. The linearized equations in joint space can be 
obtained by applying the following procedures. Substituting the approximated kinematic 
relationship between bodies n and n-1 into the variational equations yields 
n-1 

0 = X {( M?Yj - Oft + dM°Yj - dC? + 0(A) 2 } + 
i = o 


5Z 


T 

n-1 


{( 


M?Y n 


Oft 


dM°Y n 


- dOfj + 0(A) 2 } + 


(26) 
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{ (Bp 


dBp 


0(A) 2 ) [ ( M°Y n - Off) + dMpY n - d6ff + 0(A) 2 ]} 

where the perturbations are taken with respect to the state variables of the inboard bodies 
and the relative variables that are used to defined the relative motion between the bodies n and 
n-1. Moreover, the relative kinematic matrix B n _i n is denoted as B n to simplify the 


expressions during the derivation. 

Because joint n is not subject to any relative constraint between the connected bodies, 
the virtual displacement of the joint coordinate is arbitrary. Thus the coefficient of Sqj, is 
equal to zero; i.e.. 


0 = (sf + dBf 


0(A) / [ ( 


■&n*n 


Off) + dM°Y n - d6ff + 0(A)‘ 


(27) 


Substituting the first-order Taylor expansion of the acceleration vector into state vector 
space, and substituting the relationships between state and joint spaces into Eq. 27, gives the 
equation of motion corresponding to joint n. 


where 


0 = B„ T ( M°Y° - eft + d (B n T M°Y») -d(B J Cfi ) ♦ 0(a) 2 ) 


(28) 


d (BpM n Y n ) . dBj,M n Y n ♦ BldM n Y n + Bj,M„dY n 

KB^n) -dBX'Bldq, 


A A 

and dM n anddQ n can be written in terms of the state variables of body n-1 and the relative 
variables that are used to define the relative motion between bodies n and n-1. Moreover, the 
equation of motion corresponding to joint n at the reference configuration is 



Substituting the relationship in Eq. 29 into Eq. 28 and omitting higher order terms yields 


0 


d(BjM°Y°) -d(Bjoff) 


(30) 


where Y n can be expressed in terms of the derivative of a set of independent variables 

x ( = [ Yq Yq Zq qj qj qj ... qj qp] T ) by substituting the relationships from Eqs. 
18, 19, and 20 recursively for j from n to 1. 

Following the same arguments used in obtaining Eq. 30, one can obtain the linearized 
equations of motion corresponding to joint i as 

0 = B T j d( KjYj + Kj + -|Bj + i qj + i+ ... + K n B n q n - Lj)° + 

dBf ( KjYj + Kj + i B i+ i q i+1 + ... + K n B n q n - Lj) 0 

for i = 1 n ( 3 1 ) 

where Kj = M j + Kj + ^ , Lj = Qj + Lj + ^ - Kj + ^ Dj + i , K n = M n , and L n = Q n . 

By repeating the above procedure in backward path sequence to the base body, one can 
obtain the linearized equation of motion for the base body as 

d( K 0 Y 0 + K -j B -| q i -t- ... + K n B n q n - L 0 )° =0 ( 32 ) 
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For the open-chain mechanism, the linearized equations of motion at the reference 
configuration, which are represented in Eqs. 31 and 32, can be obtained recursively. During 
the derivation, every perturbed term in the linearized equations can be computed either from 
the perturbed variables in the Cartesian variables, which are computed analytically, or from 
the analytically linearized relationships among the Cartesian, state vector, and joint 
coordinate spaces. 


5. Numerical Examples 

In this section, the applications of the recursive linearization algorithm are illustrated 
by two examples: a two-link manipulator and a robot arm with seven degrees of freedom. The 
accuracy and computational efficiency of the proposed algorithm are demonstrated by 
comparing the models obtained from the recursive algorithm with those obtained from the 
analytical approach and from a numerical perturbation method. In the case of the two-link 
system, an exact linearization is accomplished by the use of the symbolic manipulator 
(MACSYMA) [15]. However, to generate the exact linearized model for a complicated system 
is very difficult, even with a symbolic manipulator. In the second example, a numerical 
perturbation is applied to the robot with seven degrees of freedom in order to generate a 
reference linearized model with which the results of the recursive linearization algorithm are 
compared. 

5.1 A Two-Link Manipulator 

In this subsection, a two-link manipulator, as shown in Fig. 7, is modeled and tested. 
Since all the joints are revolute, one independent coordinate is assigned to each joint. The 
manipulator can be modeled as a system of two differential equations. For this system, the 
linearization can also be carried out analytically by using the symbolic manipulator 
(MACSYMA). Therefore, it is possible to check the accuracy of the recursive linearization 
algorithm by comparing the linear models obtained from both approaches: recursive 
linearization and MACSYMA implementation. 

The recursive linearization produced a linearized model at the specified configuration 


that was defined by setting 0 1t e 2 , 0-,, and 0 2 to zero. The linearized equation of motion is 


written as 


1 

cl a 
JP rS* 

l 


1 

O O 

o o 

o -»• 
o 

d0 2 

dd 1 

1- 



-5.191 3.4612 0 0 

- 1 .1537 -4.0380 0 0 -1 

1 

cl a. 
JP rS>* 
1 

+ 

-dfy- 




0 

0 

-0.0824 

0.0294 


0 1 

0 rdT 2 ] 

0.0294 LdTj 
-0.01 76-1 


( 33 ) 

where T-) and T 2 are actuating torques that are applied at the revolute joints. At the same 
specified configuration, the symbolic manipulation generated the exact linearized model, which 
is identical to the one obtained from the recursive linearization approach. A comparison of the 
numerical and analytical results shows that the proposed recursive linearization algorithm 
can generate a correct linearized model at a given configuration. 

After the linearized model is obtained, a linear controller can be designed by applying 
the linear model to existing control design tools. A linear regulator is designed to control the 
motion of the manipulator by using the Pro-Matlab package. The pole placement algorithm 
[16] is used to compute the full state feedback gain matrix for the nonlinear dynamic model. 
The effectiveness of this regulator is tested by applying an initial deviation of the system and 
using the regulator to stabilize it. As expected, the linearized model can well represent the 
nonlinear model. Therefore, a small initial deviation i6 tested first. The results of 0.05 
radian initial deviations are shown in Figs. 6 and 7. The nonlinear system can be stabilized by 
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the linear regulator. Similar results are presented in Figs. 8 and 9 for 1.0 radian initial 
deviations. 

However, the pole placement algorithm for a multiple-input multiple-output system 
does not have a unique solution [17]. The feedback gain obtained from the Pro-Matlab package 
is an iterating solution, which is designed to find an insensitive set for the configuration 
change. However, this algorithm requires a lot of computation to generate an optimal gain 
matrix. Thus, this algorithm cannot be used for an on-line computation for the real time 
simulation. To fulfill the on-line computation requirement, a simple and stable pole 
placement algorithm is needed. A case particularly interesting is to determine the feedback 
controller [1 7] in such a way that the closed loop equation is decomposed into a set of n 
decoupled second-order differential equations. 

0 = d0j + 2^ o)j d0j + to? d0j ; i=1 n (34) 

where the damping factor ^ and the undamped frequency tOj of each tracking error are 
specified by the designer. Defining the nxn constant diagonal matrices A^ = diagj{2^j CDj } and 
A 2 = diagj{ co? }, one can obtain the desired decoupled closed loop equation from Eq. 34 as 

0 = d0 + A-| d0 + A 2 d0 (35) 

The closed loop equation of the linearized model with a proportional-derivative (PD) 
controller can be written in a second order differential equation form as 

0 = M d0 + ( P r K v )d0 +( P 2 - Kp) d0 (36) 

where Kp is the position feedback gain matrix and Ky is the velocity feedback gain matrix. 
Equating coefficients in Eqs. 35 and 36 gives the desired closed-loop feedback gain matrices as 

K v = MAt - P, (37) 

Kp = MA 2 - P 2 (3 8) 

Consequently, a linear regulator is designed to control the dynamic system. As shown in Figs. 
10, 11, 12, and 13, the linear regulator can stabilize the nonlinear dynamic model for both 
small and large initial deviation cases. 

5.2 A Robot with Seven Degrees of Freedom 

Figure 1 0 shows a robot arm that has seven degrees of freedom. The system consists of 
eight bodies, including the base body, which is designated as ground. The adjacent bodies are 
connected by revolute joints. Joints 1 to 7 are identified as Shoulder Roll, Shoulder Pitch, 
Elbow Roll, Elbow Pitch, Wrist Roll, Wrist Pitch, and Toolplate Roll. 

Since adjacent bodies are connected by revolute joints, one generalized coordinate is 
assigned to each joint. The motion of this system can be described by seven generalized 
coordinates; the dynamic system is thus formulated as a system of seven differential equations. 
When a reference configuration is selected, a linearized model can be generated at this 
configuration using the proposed linearization algorithm. 

The configuration that is shown in Fig. 10 is selected as a reference configuration: the 
angles of all the joints are zero (q-|, q 2 , .... q7 = 0), and the velocities of all the joints are also 

zero (4| , .... Q7 = 0). At this reference configuration, the linearized model that is obtained 
from the recursive linearization algorithm is expressed as 

dx= [m‘ 1 P 2 M^pJ^ + 

where x = ^ q 2 q 3 q 4 q 5 q6 97 Pi Q 2 P3 ^4 % ^6 ^] T > u is ^e actuating torque 
vector, M is the generalized mass matrix that is expressed in terms of joint variables, and 


M- 1 P,J du 


(39) 
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M 1p l = °7x7 


15.737 

0 -1.607 

0 

-.1451 

0 -,18e-8 

0 

15.228 0 

2.8549 

0 

-.3485 0 

.41697 

0 9.2375 

0 

-.4480 

0 -.13e-7 

0 

-11.44 0 

-6.361 

0 

.5637 0 

-29.1 1 

0 3.711 

0 

21.533 

0 .93e-7 

0 

-5.0345 0 

-13.19 

0 

-15.76 0 

.14.914 

0 5.9428 

0 

-25.38 

0 -.58e-7. 


P 3 = l 

In this case, to generate a closed-form analytical expression for the linearized model is 
too difficult for accuracy checking even if the symbolic manipulation is employed. Instead, two 
comparisons were derived to make certain that the linearized model in Eq. 39 accurately 
represents the nonlinear model. In the first comparison, both the nonlinear and linearized 
models were perturbed with the same amount, and then the resulted acceleration changes were 
examined. In the second comparison, two linearized models-one obtained from the recursive 
approach and the other obtained from the numerical perturbation method-are examined. 

In the first comparison, perturbation of the generalized coordinate x by 

1 0‘ 6 incurs the relative error of the acceleration changes between the linearized and 
nonlinear model as 


|| die* -dx || 2 
II dx*|| 2 


= 2.739 e-6 


(40) 


where dx is the acceleration change obtained from the linearized model and dx* is obtained 
from the nonlinear model. 

In the second comparison, a simple numerical perturbation without any convergence 
checking is implemented to generate a linearized model, which will serve as a reference in 
comparing the recursive linearization with the numerical perturbation. Comparing the 
linearized model obtained from the numerical method with those obtained from the recursive 
approach, one can observe that at the given configuration both approaches generate nearly 

identical linearized models, in which the relative difference is less than 10' 6 . 

However, the recursive algorithm proves to be more efficient than the numerical 
perturbation method. In comparison with the numerical perturbation method, the recursive 
linearization took half the epu time to generate a linear model, even though the numerical 
perturbation method used here was a relatively simple one. If a convergence checking 
algorithm was employed for the numerical perturbation method, it would take even longer to 
generate a linear model. 

The simple pole placement used in the previous example was used again to design a 
linear regulator. The desired closed-loop poles were selected to make the simulation results 
similar to the experimental results. After properly selecting the desired poles, we used this 
linear regulator to control the nonlinear dynamic model. The step response of Joint 4 is shown 
in Fig. 1 1 . From this result, it is clear that a simple regulator based on a linearized model 
simulates the behavior of a complicate control system around a reference configuration. 


6. Conclusion 

In these examples, we have shown that the proposed linearization algorithm is both 
efficient and accurate in generating a linear model at given configurations. These linear 
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models are converted to the standard state space forms, which are convenient for linear 
control design. Moreover, the driving force input for a required motion around a given 
configuration can be predicted by using the linearized model. When a large gross motion is 
involved in a prescribed trajectory, more than one linearized model may be necessary for 
robust control. In such a case, the computation of linearization must be fast enough to update 
the linearized model before it fails to represent the system adequately. With the emerging 
parallel processing computers and computation algorithm, the use of successive linearization 
will be possible for on-line adaptive control. 
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Figure 1 . Recursive Derivation Flow for Dynamic Equations of Motion and 
Proposed Linearization Algorithm 
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Figure 6. Response of small Initial Deviation 



Time ( sec) 

Figure 7. Response of large Initial Deviation 




Figure 8. Response of small Initial Deviation Figure 9. Response of large Initial Deviation 
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Abstract 

The paper solves the problem of hinged multibody dynamics using an 
extension of the innovations approach of linear filtering and prediction 
theory to the problem of mechanical system modeling and control. This 
approach has been used quite effectively to diagonalize the equations for 
filtering and prediction for linear state space systems. It has similar 
advantages in the study of dynamics and control of multibody systems. 
The innovations approach advanced here consists of expressing the 
equations of motion in terms of two closely related processes: (1) the 
innovations process e, a sequence of moments, obtained from the applied 
moments T by means of a spatially recursive Kalman filter that goes from 
the tip of the manipulator to its base; (2) a residual process, a sequence of 
velocities, obtained from the joint-angle velocities by means of an outward 
smoothing operations. The innovations e and the applied moments T are 
related by means of the relationships e = (I - L)T and T - (I + K)e. The 
operation ( J - L) is a causal lower triangular matrix which is generated by a 
spatially recursive Kalman filter and the corresponding discrete-step Riccati 
equation. Hence, the innovations and the applied moments can be obtained 
from each other by means of a causal operation which is itself causally 
invertible. The residuals n and the joint-angle velocities q are related by n 
= (/ - K*)q and q - (/ - L*)n in which (/ - L*) is also an anticausal, upper- 
triangular, matrix. Hence, the residuals and the joint-angle velocities are 
related by means of an anticausal operation which is itself anticausally 
invertible. The use of the residuals process is of interest because it 
diagonalizes the composite multibody system kinetic energy. In other 
words, the kinetic energy J fc, q) in the system can be written as J = l/2n T 
Dn in which D is a diagonal matrix. The Lagrangian equations of motion 
that result from this diagonal form for the kinetic energy are completely 
decoupled in the sense that the equation for the residual velocity at any 
given joint is independent from the similar equations at all of the remaining 
joints. The innovations process appears as a driving term in these 
equations. Use of the innovations, in place of the physically applied joint 
moments, decouples the equations even further. The equations of motion 
for joint k involves only the value of the innovations at the same joint. The 
final equations of motion are therefore diagonalized in the sense that the 
equation for any given joint is independent from the equations at the other 
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joints. The diagonal form of the equations of motion results in significant 
simplification of dynamic analysis, simulation, stability analysis, and control 
design. This simplicity is illustrated by arriving a very simple decoupled 
control algorithms for robotic manipulator control. 
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Abstract 

An efficient 0(mN ) algorithm for dynamic simulation of simple closed-chain robotic mechanisms will 
be presented in this paper, where m is the number of chains, and N is the number of degrees of 
freedom for each chain. It is based on computation of the operational space inertia matrix (6 X 6) 
for each chain as seen by the body, load, or object. Also, computation of the chain dynamics, when 
opened at one end, is required, and the most efficient algorithm is used for this purpose. Parallel 
implementation of the dynamics for each chain results in an O(N) + O(log 2 m + 1) algorithm. 

I. Introduction 

Recently, there has been an increasing interest in robotic systems with multiple chains forming simple 
closed kinematic loops. Such systems of interest in space robotics applications include multilegged 
vehicles, multiple manipulators, and dexterous hands. Each is characterized by multiple chains of 
links (legs, arms, or fingers) in support of a body, load, or object. Real-time simulation of these 
systems is important for remote operation, but difficult to achieve at present. An even greater 
challenge to the computational engineer is that of super-real-time simulation , that is, planning seconds 
of motion in milliseconds. This has been shown to be of value in the control of a multilegged vehicle 
when predicting the action of the present control to ensure safety and stability along a planned 
trajectory. 

The fundamental goal of this paper is the development of an efficient algorithm for the dynamic sim- 
ulation of the time- varying topological systems discussed above. Previous researchers have presented 
algorithms for these and similar configurations based on equation augmentation [1], constraint prop- 
agation [2], and recursive computation [3,4], but these methods are often difficult to apply and/or 
computationally inefficient. The new simulation algorithm derived here makes use of efficient com- 
putations for the individual supporting chains to produce an efficient simulation method for the 
complete robot system. The dynamic properties of each chain are described in a simple, physically 
understandable manner, which facilitates the straightforward analysis of the combined dynamics of 
the entire mechanism. 

Multiple chain robotic systems can take many forms, some of them quite complex. Simple closed- 
chain mechanisms are a subset of multiple chain systems with specific structural characteristics. The 
structure of a simple closed-chain mechanism is characterized by m actuated chains which support a 
single common reference member [1]. A supporting chain is identified as an independent functional 
unit in the closed chain system which has two ends, each terminated by a single link. Each chain 
may have an arbitrary number of links and degrees of freedom, and closed kinematic loops within 
a chain are permitted. The removal of the reference member breaks the closed loops formed by the 
multiple chains. 
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Reference 
Member (0) 



Figure 1: Example of a Type 0 Simple Closed-Chain Mechanism 



Figure 2: Example of a Type 1 Simple Closed-Chain Mechanism 


There are two basic types of simple closed-chain mechanisms called Type 0 and Type 1, respectively 
[1]. These two types are defined based on the nature of the interactions which occur between the 
links of each chain and the reference member or support surface. Figure 1 illustrates a typical Type 

0 mechanism which may be used to model multiple manipulators or dexterous hands. Note that the 
support surface, shown here as a fixed inertial frame for a multiple manipulator configuration, might 
also represent the moving “palm” of a dexterous hand. In either case, for a Type 0 mechanism, the 
base link of each chain is connected to the support surface by an actuated joint structure, while the 
last link interacts with the reference member through an unpowered contact. Figure 2 illustrates a 
Type 1 simple closed-chain mechanism which may be used to model multilegged vehicles. For a Type 

1 mechanism, the last link of each chain interacts with the support surface through an unpowered 
contact, while the base link is connected to the reference member by an actuated joint structure. For 
both Type 0 and Type 1 mechanisms, the reference member (object, load, or body) is numbered 0, 
while the chains are numbered arbitrarily from 1 to m. Chain k (k = l,...,m) has jV* degrees of 
freedom, where Nk may be less than, equal to, or greater than 6. 

In order to apply the same algorithm to both types in this work, the support surface will be considered 
to act as the “base” of each chain. We will refer to the terminal link which interacts with the support 
surface as link 1, and the terminal link which interacts with the reference member will be called the 
last link or end effector (link N). The far end of link N is the “tip” of the chain. The interactions 
and connections which occur between bodies or links in the system (including those at the support 
surface and at the reference member) will be described using the general joint model of [5,6]. This 
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includes both powered joint structures and unpowered contacts. The motion of the support surface 
is assumed to be known. 

In this paper, an 0(mN ) recursive algorithm for the dynamic simulation of simple closed-chain 
mechanisms is derived for m chains with N degrees of freedom each. The algorithm is based on the 
efficient computation of the (6 x 6) operational space inertia matrix [7] for each chain as seen by the 
body, load, or object. The operational space inertia matrix, A, may be used to obtain the net effect 
of the chain dynamics at its tip. The computation of the chain dynamics when the chain is open 
at one end is also required, and the most efficient algorithm is used for this purpose. Given 0(N) 
algorithms for these two fundamental computations for each chain [4, 5, 8, 9], an O(mN) algorithm for 
the simulation of the entire multiple chain system is formulated. 

In the next section, the notational and modelling conventions used in the formulation of the new 
algorithm are summarized. In the third section, the dynamic properties of the individual supporting 
chains and the common reference member are discussed, and the appropriate dynamic equations 
are developed. The operational space inertia matrix and the open-chain dynamics of each chain 
are of special significance in this discussion. In the fourth section, the 0(mN) dynamic simulation 
algorithm for simple closed-chain mechanisms is derived. The final algorithm is presented as a series 
of five steps, which are summarized in a convenient tabular form. The computational requirements 
of the new algorithm, including parallel implementation considerations, are presented in the fifth 
section. Finally, the results of this work are summarized and some overall conclusions are given in 
the final section. 

II. Notation 

Many of the notational conventions used in this paper are based on concepts introduced by Roberson 
and Schwertassek in [6] and used by Brandi, Johanni, and Otter in [5]. They are similar in many 
ways to those described by Featherstone in [9] for robot dynamics, although there are a few minor 
differences. As in each of these, spatial notation will be used to develop the dynamic equations for 
the chains and reference member. With spatial notation, velocity, acceleration, and force vectors are 
all 6 x 1 column vectors, where each incorporates the appropriate linear and angular components. In 
this paper, the spatial velocity of the reference member, Vo, is written: 

Vo — [ (^o)x ( w o)y ( w o)z ( u o)r ( w £))y ( v o)z ] j 

where (w 0 )x, (wo)„, and (w 0 )z are the components of the angular velocity of the reference member 
about x, y, and z, respectively, usually resolved in the reference member frame (frame 0) or an inertial 
coordinate system. The three components, (v 0 ) x , (u 0 ) y , and (» 0 )z, represent the linear velocity of the 
coordinate origin of frame 0. Similarly, the spatial acceleration of the reference member is expressed 
as: 

ao = [ («o)x ( a o)y ( a o)z ( a o)x (°o)y ( a o)z ] » 

where the individual components now correspond to resolved angular and linear acceleration vectors. 
Spatial force vectors have a corresponding structure: 

f k = [ (n k )x (nfc)y ( n k)z (fk)x ( fk)y ifk)z ] > 

where, in this case, f k will be used to represent the spatial force exerted on the reference member 
by chain k. The first three components, (n*) x , (n k ) y , and (n k ) z , represent the elements of a three- 
dimensional moment vector, while (/*)*, (/fc)y, and (/*), are the elements of a three-dimensional 
force vector. 
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In general, the transformation of a spatial velocity or acceleration vector from one coordinate system 
to another one may be accomplished by the following spatial multiplication [9]: 

J p = (4) 

where *p is the vector expressed with respect to the ith coordinate system, ■ 7 p is the same vector 
expressed with respect to the j'th coordinate system, and ^X,- is the 6x6 spatial transformation 
matrix. This spatial transformation is defined as follows: 


^X, = 


j Ai 0 

J A, bj ^ 


(5) 


where J A ,• is the 3x3 rotation transformation between the two coordinate systems, and bj is the 
3x1 position vector from the origin of frame t to the origin of frame j, with components expressed 
in frame i. The 3x3 matrix, bj, is an anti-symmetric matrix defined by the rule: 


c = 


0 c 3 c 2 
C3 0 -Cl 

-c 2 ci 0 


( 6 ) 


In spatial notation, inertia matrices are also expressed as 6 X 6 matrices. An inertia matrix may be 
defined for each individual link of a chain, as well as the reference member, in its own corresponding 
coordinate system. For the reference member, this matrix, Iq, is represented as follows: 


Io = 


lo 

h 0 T 


ho 

Mo 


(7) 


where M 0 is a 3 X 3 diagonal matrix of the mass of the reference member, and I 0 is the 3x3 moment 
of inertia tensor at the origin of coordinate frame 0. The matrix ! 0 is symmetric and positive definite, 
but not necessarily diagonal. The 3x3 matrix, ho, is equal to moso, where mo is the mass of the 
reference member, and s 0 is the position vector of the center of gravity of the reference member from 
the coordinate origin of frame 0. Because I 0 and s 0 are defined in coordinate system 0, the matrix 
Io is constant. 


To include general joints and contacts with multiple degrees of freedom in a multibody system, an 
extended model of the interconnections and interactions between individual bodies of that system is 
required. In this paper, the general joint model of Roberson and Schwertassek [6] is used for this 
purpose. This model is also used by Brandi, Johanni, and Otter in [5], 

Briefly, each interconnection and/or interaction between two bodies in a simple closed-chain mecha- 
nism, hereafter referred to as a “general joint”, is described in terms of two orthogonal vector spaces, 
<(> and <j> c . The matrix cj> is of dimension 6 X n, where n represents the number of degrees of freedom 
of the general joint, and it has full column rank. It represents the free modes of the joint, and its 
columns make up a basis for this free vector space. We will refer to <p as the motion space of the 
general joint. The matrix </> c , which is 6 X (6 — n) and also of full rank, represents the constrained 
modes of the general joint. It is orthogonal to </>, and may be called the constraint space of the joint 
Both <j> and 4> c are usually resolved in the joint frame, and thus, they are both constant. 

Dynamic Properties of Individual Chains and Reference Member 

Each chain in a simple closed-chain mechanism is governed by the dynamic equations of motion for 
a single chain. For chain k, k = 1, . . . , m, these are: 
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Tk = Hfc qjt + C k qfc + Gk + J k ffc» 


( 8 ) 


where 


Tk 

q*,<jfc,q* 

Hjt 

c* 

Gk 

Jk 


Nk x 1 applied general joint torque/force vector, 

Nk x 1 general joint position, rate, and acceleration vectors, 
^k X Nk joint space inertia matrix, 

Nk x Nk centripetal/Coriolis matrix, 

Nk x 1 gravity vector, 

6 X Nk Jacobian matrix, 


and f it is the (6 X 1) spatial force vector exerted by chain k on the reference member. 


Note that H*, C k , G*, and J fc are functions only of the general joint position and rate vectors, q* 
and q respectively. Recall also that the “base” of each chain is the support surface, and the “tip” 
of each chain touches the reference member. The components of q* and 7> correspond to the general 
joints of each chain, starting with the joint between link 1 and the support surface and ending with 
the joint preceding link N. The basic unknowns in Eq. (8) are the general joint accelerations, qjt, 
and the components of the force vector, f*, in the constrained directions of the general joint at the 
tip of chain k. 


We may use the dynamic equations of motion to partition the joint acceleration and spatial tip 
acceleration vectors of each chain into the difference of two terms, one known and one unknown. For 
each chain, we may write [11]: 

• q* = (9) 

= (qfc)open - ft* fjfe, (19) 

where (qjt)open is the vector of joint accelerations for chain fc in an open, unconstrained configuration 
(fit = 0), and flk is a function of the joint positions for chain k. Likewise, for X*., the tip acceleration 
for each chain: 


x* = (x fc ) ope „-(JfcH^jf)f*, (11) 

= (xfc) open - A* ‘ffc, (12) 

where ( Xk)open is the spatial tip acceleration vector for chain k in an open, unconstrained configu- 
ration, and Aj^ 1 is the inverse operational space inertia matrix for chain k y defined at the tip of the 
chain [7,11]. 


The open-chain terms, (qjt) op cn and (xfc)open ? are completely defined for each chain given the present 
state joint positions and rates, and qjt, the applied joint torques /forces in the free directions, r*, 
and the motion of the support surface. An appropriate open-chain Direct Dynamics algorithm may 
be used to calculate these terms. The 0(N ) recursive algorithms of [5,10] are very efficient for this 
computation, and the linear order of computation is highly desirable. Because the joint positions 
are known, ft* and Aj^ 1 are also defined. Efficient algorithms for fl* and A^ 1 for a single chain are 
derived in [11], including an O(N) recursive algorithm which is the most efficient for N > 6. 


The dynamic behavior of the reference member may be described using a spatial force balance 
equation for that body. The sum of the spatial forces exerted by each chain on the reference member 
and any other external spatial forces (including gravity) are equal to the resultant force on the 
reference member. Using spatial notation, we may write the force balance equation as follows: 
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( 13 ) 


m 

F 0 = £°f fc + go, 

k= 1 

where 

F 0 = 6 x 1 resultant spatial force vector applied to the reference member, 

°ffc = 6x1 spatial force vector applied by chain k to the reference member, 

and 

go = 6x1 external spatial force vector applied to the reference member 

(including gravity). 

Each force term in Eq. (13) is defined with respect to the coordinate frame attached to the reference 
member (frame 0). Applying the basic Newton-Euler equations, we may also write the resultant 
vector, Fo, as follows: 

F 0 = Ioa 0 + v 0 X Io v 0 , 

— Io&o + bo, 

where 

Io = 6x6 spatial inertia of the reference member, 

ao = 6x1 spatial acceleration of the reference member, 
v 0 = 6x1 spatial velocity of the reference member. 

Both v 0 and ao refer to the motion of the coordinate origin of frame 0. The spatial inertia matrix, 
Io, is also defined at this point, and it is known and constant. Because vo is given for the present 
state, the velocity-dependent term, b 0 , is known. If we combine Eqs. (13) and (15), we finally obtain 
the following dynamic equation for the reference member: 

m 

E °fjfc + go = Io ao + bo 

k=i 

In this equation, the basic unknowns are ao and the components of °f* in the constrained directions 
of the general joint at the tip of chain k . 

IV, Multiple Chain Algorithm 

In developing an efficient algorithm for the dynamic simulation of simple closed-chain mechanisms, 
we are naturally led to consider the relationship between the physical structure of the robotic sys- 
tem and the computational structure of the desired algorithm. Intuitively, it seems apparent that 
the structural parallelism present in a simple closed-chain mechanism should lead to computational 
parallelism in the solution of the Direct Dynamics problem for that mechanism. 

More specifically, in a simple closed-chain mechanism, the to actuated chains act on the reference 
member in parallel, and their motion is coupled with that of the reference member. If the reference 
member is removed, the chains may function independently. Computationally, the physical removal 
of the reference member corresponds to solving for the forces which are exerted on it by each chain. 
Once these forces are known, the system is equivalent to a group of independent chains with known 
tip forces. The joint accelerations may then be computed for each chain separately. Given enough 
processors (one per chain), the computations for each chain may be carried out in parallel. 


(16) 


(14) 

(15) 
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We will illustrate the basic methodology of the new simulation algorithm by first examining a simple 
special case. Consider m manipulators rigidly grasping a common object. Each manipulator has six 
degrees of freedom, and no chain is in a singular position. For simplicity, we will express all of the 
relevant equations in absolute coordinates. Because each chain tip is rigidly attached to the reference 
member, we may write: 

x fc = a 0 ( 17 ) 

for each chain k , k — 1 , Thus, the operational space dynamic equation for each chain, as 

given in Eq. (12), takes the form: 

»0 - (x*)open - (18) 

Because no chain is in a singular position, and each chain has a full six degrees of freedom, Afc is 
defined [11]. We may, therefore, solve for the spatial tip force exerted by chain k on the reference 
member, ffc, as follows: 

fj. = A* [(xfc) 0 p Cn — ao] • (1®) 

With this equation we have established an explicit relationship between the spatial tip force, f*., and 
the spatial acceleration of the reference member, ao. This expression may be used in the reference 
member dynamic equation, given in Eq. (16), to obtain: 


m 

[(x*) ope „ - a 0 ] = Io «o + bo - go- 

*=i 


( 20 ) 


The only unknown in Eq (20) is ao, the spatial acceleration of the reference member. Collecting 
terms, we may write: 


io + 


k=\ 


ao 


m 

£a k (Xfc)open — Bo T go • 
Jk=l 


( 21 ) 


We may now solve for ao from this linear system of algebraic equations using any linear system 
solver. Note that the characteristic matrix is just the sum of the operational space inertia matrices 
of the individual chains and reference member, and is only 6x6. With ao known, we may also 
solve explicitly for the spatial tip force f*., k = 1 using Eq. (19). Thus, the motion of the 

reference member and the spatial force exerted at the tip of each chain are completely defined. The 
simple closed- chain mechanism is effectively decoupled. Each manipulator may now be treated as 
an independent chain with a known spatial tip force. The joint accelerations for each chain may be 
computed separately using an appropriate Direct Dynamics algorithm and then integrated to obtain 
the next state. 


The method outlined above is quite straightforward. Of course, the illustrated example represents a 
special case. We will now develop a similar approach for a general simple closed-chain mechanism. 
Consider a mechanism with m chains, each with an arbitrary number of degrees of freedom, N . The 
interaction between each chain tip and the reference member is arbitrary and will be modelled using 
the general joint model of [6]. To begin, we will derive an explicit relationship between the spatial 
acceleration of each chain tip and the spatial acceleration of the reference member. The spatial 
acceleration of the tip of chain k is denoted by x*. The relative spatial acceleration between the tip 
of chain k and the reference member, x£, resolved in the orthogonal vector spaces of the general joint 
between them, may be written: 

xJb = {4>)kOtk + (<p c )kM k , 


79 

C-2s 


( 22 ) 



where (<j>)k and {4> c )k are the motion space and constraint space of the general joint at the tip of chain 
A;, respectively. The quantities a * and are the corresponding components of relative acceleration 
in the free and constrained directions. For each chain, (<f>)k, ( 0 C )*, and a% are known, while a * is 
unknown. The sum of x* and x£ is just the spatial acceleration of the reference member on the far 
side of the general joint between it and chain A;, a£. Thus, we may write: 

a£ = Xfc + Xfc, (23) 

= Xjt + {4>)k a k + {4> c )k a k- (24) 


We may also express a£ in terms of the spatial acceleration of the reference member, ao, as follows: 


ao — X* ao + Co > 


(25) 


where Xq = ^Xq is the spatial transformation between coordinate frame 0 and the coordinate frame 
associated with the general joint at the tip of chain k. The quantity (q is the 6x1 bias acceleration 
vector which is a function of the position and spatial velocity of the reference member. Because the 
present state of the entire system is given, both Xq and (q are known. 


Equating the two expressions above for a£, we obtain the following: 

x* + (<£)* <*k + ( 4> c )k &k = Xq ao + Co * (26) 

This equation matches the spatial accelerations at the coupling point between chain k and the refer- 
ence member, giving an explicit relationship between x* and ao when the coupling is arbitrary. The 
basic unknowns in Eq. (26) are x*, a*, and ao. All other vectors and matrices may be computed 
rather simply from the initial information given for the simulation problem. 


To decouple the chains and the reference member, we need an explicit mathematical relationship 
between the spatial force exerted by chain k on the reference member, fit, and the spatial acceleration 
of the reference member, ao. Equation (26) relates the spatial acceleration of the reference member 
and the spatial acceleration of the tip of chain k. We may eliminate a *, the unknown components 
of the relative acceleration, by projecting Eq. (26) onto the constraint space of the corresponding 


general joint as follows: 

(<nl [x fc + (4>)k «k + {4> c )k <*%] = {4> c ) T k [x* ao + Co] • (27) 

By definition [11]: 

(d> c )l(4>)k = 0, ( 28 ) 

and . 

(<t> c ) T k (<n k = i. ( 29 ) 

Thus, we may write: 

{<t> c jl *k + a c k = ( <t> c )k [X$ ao + Co] ■ (30) 

Equation (12) defines x* in terms of the desired force vector, f*. If we combine Eqs. (12) and (30), 
we obtain: 

(4> c )k [(x fc ) ope n - A ;%] = (4> c )l [x* ao + Co*] - a ki (31) 


or 
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( 32 ) 


| {<t> C )k *] ffc = [«£ - ( 4> C )k Co + (4^)1 (^fc)open] [(4 )fc X 0 J So- 

The first bracketed term on the right side of Eq. (32) is completely known. The only unknowns in 
this equation are the constraint components of the force vector, ft, and the spatial acceleration, a 0 . 
We may now pursue an explicit relationship between these two vectors. 

Like the relative acceleration vector, i k may also be resolved in the orthogonal vector spaces of the 
general joint at the tip of chain k as follows: 

fjt = (4)k hfc + (4> c )k h£, 

where hjt is the vector of known force components in the free directions, and h k is the vector of 
unknown force components in the constrained directions. Combining Eqs. (32) and (33), we obtain: 

W e )lA?mkh k + (*%hl) = \a c k -(ct> c ) T kCo+(<l> C )l(^)o P en} 

-[(^XS]ao. (34) 

If the spatial acceleration of the reference member is known, we may find an explicit solution for the 
unknown force components at the tip of chain k from the following set of linear algebraic equations: 

[{<t> C )I Afc 1 ^ 6 )*] K = {<4 - ( <t> C jk [Co - (x*:)open + A* W hfc] } 

-[(<n*XS]ao, ( 35 ) 

= S k - [(^l X*] ao, (36) 

where S* is known. Even when a 0 is unknown, we may still find a solution for h c k in terms of the 
unknown ao. The solution will have the following form: 

h% = M k [s k -mlX k 0 a 0 ], (37) 

= M fc S*-[M fc (dOrxS] ao. (38) 

If (n c )fc is the number of degrees of constraint for the general joint at the tip of chain k , then M* is 

the ( n c ) k X (n c ) k transformation matrix which solves for h c k . By carefully considering the rank of the 
coefficient matrix, this general solution can still be used for a chain in a singular position or a chain 
with less than six original degrees of freedom [11]. This solution procedure requires 0[(n c )j.] scalar 
operations. 


Note that if chain k is rigidly grasping the reference member, then the constraint space for this 
general joint, (4> c ) k , is the 6 x 6 identity matrix. In this case, also note that a c k and h* are identically 
zero for each chain. If chain k has six degrees of freedom and is not in a singular position, then M*, 
will be exactly equal to A fc , the operational space inertia matrix for chain k, and the solution for h c k 
will be: 

h£ = A fc [s*-X5ao] . . (39) 

This solution corresponds to the simple example discussed at the beginning of this section, but now 
expressed in local coordinates. 
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Given the general solution for h£ in Eq. (38), the force vector, f*, may now be written: 


i k 


{<t>)k hfc + ( <t> c ) k h£, 

K*)* hfc + 0 i> c )k M, Sfc] - [(<f>% M k {<j> c ) T k X*] a 0 , 
Pfc - 


(40) 

(41) 

(42) 


where P* and R* are of dimension 6x1 and 6x6, respectively, and both may be computed from 
known quantities. We now have an explicit equation relating the force vector exerted by chain k 
and the spatial acceleration of the reference member. We may combine this information with the 
dynamic equation for the reference member to solve for ao explicitly. 


The dynamic equation for the reference member given in Eq. (16) may be rewritten as follows: 

m 

+ go = Io*o + bo- (43) 

k=i v ’ 

where f* is the spatial force exerted by chain k on the reference member, expressed in the coordinate 
frame of the general joint at the chain tip. If the expression for f k in Eq. (42) is used in Eq. (43), we 
obtain: ' 


m 


]C(Xo) T (f*fc — R*ao) = Io»o + bo - 

Jfc=i 


go- 


(44) 


Summing like terms, we may write: 



a 0 = 


f m 


£(XS) r P*-b 0 + g 0 , 
U=i J 


or, expanding R t , 


(45) 


Io + E( X 5) T (^M,(^)f(X5) 


/c=l 


ao = 


m 


£(X*) T P*-b 0 + g 0 . 

k=l 


(46) 


In Eq. (46), the 6 X 1 spatial acceleration vector of the reference member, ao, is the only unknown. 
We may find a solution for ao from the given set of linear algebraic equations using any efficient 
linear system solver. Because the required system solution always involves a 6 X 6 coefficient matrix, 
the computational cost of solving for ao is constant. 


The coefficient matrix of a 0 in Eq. (46) represents the combined inertial properties of all the chains 
and the reference member. The inertial properties of each chain are first projected to the tip of that 
chain by computing the inverse operational space inertia matrix, A* 1 . Along the free directions of 
the general joint which connects the chain tip and the reference member, the projected inertia of the 
chain is not felt by the reference member. Along the constrained directions of the joint, however, the 
corresponding components of A* 1 are reflected across to the reference member. These components, 
spatially transformed to the coordinate origin of frame 0, are combined with the spatial inertia of 
the reference member, I 0 . This combination represents the effective operational space inertia of the 
simple closed-chain mechanism defined at the coordinate origin of frame 0. It is the effective inertia 
felt by the reference member in the present state. The bracketed term on the right side of Eq. (46) 
represents the spatial forces w'hich act on the reference member at the given instant. 

Once the spatial acceleration of the reference member is known, the spatial force vector applied by 
chain * to the reference member is defined by Eq. (42). That is, with a 0 given, we may compute f. 
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f fc = P*-mao. v ' 

Recall that f* is defined with respect to the coordinate frame of the general joint between chain k 
and the reference member. The explicit knowledge of ft allows us to treat chain k as an independent 
chain with a known tip force. We may now solve for the general closed-chain joint accelerations for 
chain k using Eq. (10), repeated here for convenience: 

q* = (q*)open-n*f fc . (48) 

The application of Eq. (48) to every actuated chain in the simple closed-chain mechanism results in a 
complete solution to the Direct Dynamics problem for this robotic system. The next state positions 
and velocities may be computed by integrating the appropriate quantities for each chain and the 
reference member. As discussed in [11], small amounts of negative position and rate feedback may 
be employed to counteract the drift which is a result of the integration process, and which would 
violate the kinematic constraints. 

The algorithm developed here for simple closed-chain mechanisms may be presented as a series of 
five steps. They are as follows: 

1. The Open Chain Solution, 

2. Calculation of the Spatial Acceleration of the Reference Member, 

3. Calculation of the Spatial Chain Tip Forces, 

4. Calculation of the Closed-Chain Joint Accelerations, 

5. Integration for the Next State. 

The fundamental computations required in each of these steps are summarized in Table 1. In Step 1, 
the Direct Dynamics problem is solved for each chain of the mechanism assuming that the reference 
member has been removed and each chain is in an open, unconstrained state. The general open-chain 
acceleration vectors, (q fe )o P en and (x fc )o P en, are computed for each chain, along with the position- 
dependent matrices, and A* 1 . In Step 2, Eq. (45) is used to find an explicit solution for ao, 
the spatial acceleration of the reference member, via linear system solution. The quantities (M* S*) 
and [M fc ( <f> c )l X§], required for both P k and R k , are computed in the determination of the explicit 
relationship between h£ and a 0 . This relationship is found by Unear system solution using Eq. (36), 
with the solution taking the form of Eq. (38). In Step 3, this solution is used in Eq. (47) to solve 
for the spatial force vector exerted on the reference member by each chain. In Step 4, the general 
closed-chain joint accelerations are computed for each chain using Eq. (48), given the spatial tip force 
vector. In Step 5, the appropriate rates and accelerations are integrated to obtain the next state 
positions and rates for the system. 

Note that the first step may be carried out for all chains in parallel, if enough processors are available 
(one per chain). Once the second step is complete and ao is known, the third, fourth, and fifth 
steps may also be carried out for all chains simultaneously. Thus, taking advantage of the structural 
parallelism inherent in the simple closed-chain system has led to parallelism in the computational 
structure of the simulation algorithm. 

V. Computational Requirements 

We will now consider the computational requirements of the dynamic simulation algorithm for simple 
closed-chain mechanisms. First, the number of scalar operations required for each chain of the 
mechanism will be tabulated, followed by the number of operations required to compute the spatial 
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Table 1: Dynamic Simulation Algorithm for Simple Closed-Chain Mechanisms 


Given: h*, (<f> c )k, (</>)*, and with 

Xq, bo, go, Co determined from the reference member state; 

Step 1. Compute (<j[A;)apcn, (^fc)open, O/c, nnd A k ; k = 1, ... y m . 

Step 2. Solve for ao: 

io + f;(x*) r R fc 

*=i 

with 

Pjt = [(<f>)k h* + (<j> c )k Mjt Sk ] , 

R* = [(<A c )*M*(^)£x£], 

S k = a%- {<j> c ) T k [<* - (x fc ) open + A^ 1 {4>) k h*] , 
and where (M* S*) and (<j> c )J XqJ are determined by the solution of: 

(M;')hj = [(*'tfA;>w‘) t ] hj = s t -[(^)J-x‘j 

Step 3. Solve for f*; k = 1, . . . ,m: 
f* = Pfc - R-fcao- 
Step 4. Solve for q*; k = 1, . . . , m: 

<1* = (qi)open - fifc ffc. 

Step 5. Integrate to obtain the next state positions and rates for the system. 


a 0 


£(XS) T P*-b 0 + go 


fc=i 
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Table 2: Computations Per Chain in the Simple Closed- Chain Dynamic Simulation Algorithm 





#Mult. 

#Add. 

Calculation 

#Mult. 

#Add. 

(N = 6, n c = 3) 

(N = 6, n c = 3) 

Qopen > X open 

250 N - 182 

220 N - 167 

1318 

1153 

o.a - 1 

400N - 621 

320 TV - 528 

1779 

1392 

P,R 

g»c 3 + 6|n c 2 + 5 J-n c + 26 

|n c 3 4- 6n c 2 4- |n c + 10 

105 

71 

X T P,X T R 

3 6n c + 20 

36 n c - 24 

128 

84 

f 

36 

36 

36 

36 

q 

6 N 

6N 

36 

36 

Total: 

656N - 767 

546 JV - 659 

3402 

2772 


+ (|n c 3 + 6|n c 2 + 4l|n c + 46) 

+ (£n c 3 + 6n c 2 4- 36|n c — 14) 




acceleration of the reference member. The computational complexity of the complete algorithm will 
then be discussed, and the parallel implementation of this algorithm will be considered. 

Table 2 lists the number of scalar operations (multiplications, additions) required in the simulation 
algorithm for each chain of a simple closed-chain mechanism. The operations are tabulated for the 
case of an N degree-of-freedom serial-link chain with simple revolute and/or prismatic joints only. 
The 0(N ) Direct Dynamics algorithm of [5] is used to compute the open-chain terms, q 0 pen 
Xopen- The 0(N ) Force Propagation Method of [11] is used to compute to and A -1 . All q opeJl , x open , 
to, and A -1 are computed in Step 1 of the simulation algorithm. 

In Step 2 of the simulation algorithm, the spatial acceleration of the reference member is calculated 
using Eq. (45). For this task, X T P and X r R must be computed for each chain. The number of 
operations required to compute P, R, X T P, and X r R are also listed in Table 2. In this case, the 
number of operations is a function of the number of degrees of constraint at the general joint between 
the chain tip and the reference member (n c ). This number can never be greater than six. The 
computational complexity of these calculations is 0(n%) due to the linear system solution required 
in the computation of both P and R (see Table 1). 

The spatial force vector, f , exerted by each chain on the reference member, and the closed-chain joint 
accelerations for the chain, q, are calculated in Steps 3 and 4 of the simulation algorithm, respectively. 
The appropriate equations are given in Table 1. The operations required to calculate these vectors 
complete the table. The operations required for the special case of N = 6 and n c — 3 are given in 
the last two columns of Table 2. This value of n c could correspond to a hard point contact between 
a manipulator tip and surface of a load object when the tip is not slipping. 

Given the computations required for each individual chain, the number of scalar operations needed 
to compute the spatial acceleration of the reference member, a 0 , is given in Table 3. Equation (45) 
is used to obtain the solution, which requires 0{m) spatial additions and a single 6x6 symmetric 
linear system solution. Thus, the number of operations required for ao is a function only of m, the 
number of chains in the simple closed-chain mechanism. The example of three chains (m = 3) is 
given in the last two columns of this table. 

To determine the total number of scalar operations required to simulate the entire simple closed- 
chain mechanism, the number of operations required for a single chain is simply multiplied by m, 
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Table 3: Computations for the Spatial Acceleration of the Reference Member 


Calculation 

#Mult. 

#Add. 

#Mult. 
(m = 3) 

#Add. 
(m = 3) 

a 0 

86 

27 m + 71 

86 

152 


the number of chains, and added to the computations required for ao for the same number of chains. 
Thus, the computational complexity of the complete simulation algorithm is O(mN) for a given value 
of n c < 6. 

The total computational complexity discussed in the previous section only considered the execution of 
the simulation algorithm on a single processor. In order to speed up the simulation, parallel processing 
may be investigated. If a single processor is used for the entire system of m chains, the computational 
complexity of the simulation algorithm is 0(mN) for a given n c < 6. Given ao, all computations for 
each chain may be carried out independently. Thus, if m processors are available, the computational 
tasks associated with each chain may be performed in parallel, and the computational complexity 
of the operations required for the m chains may be reduced to O(N). Of jcourse, the computations 
required to compute ao must also be considered. These operations may also be implemented in 
parallel on the m available processors. Equation (45) requires 0(m) spatial additions to compute ao- 
On (m + l)/2 parallel processors, this task may be carried out in 0(log 2 m+l) operations by using 
the recursive doubling approach [12]. Thus, on m parallel processors, the computational complexity 
of the entire dynamic simulation algorithm may be reduced to O(N) + 0{ log 2 m + l). 

VI. Summary and Conclusions 

In this paper, a general and efficient dynamic simulation algorithm for simple closed-chain mechanisms 
was derived. The algorithm is applicable to both Type 0 and Type 1 mechanisms. Both types of 
mechanisms are modelled in a convenient and general manner through the use of the general joint 
concept. The operational space inertia matrix of each chain is used to project the dynamic properties 
of the chain to its tip where it is coupled to the reference member. By combining the operational 
space inertia of each chain with the model of the general joint at each chain tip, a solution may be 
found for the spatial acceleration of the reference member and the spatial force vector exerted on it 
by each chain. Once the force vectors are completely defined, the system is effectively decoupled, 
and the joint accelerations for each chain may be computed separately. 

The computational complexity of the new simulation algorithm is 0(mN) when implemented on a 
single processor. The linear dependence on N is a significant improvement over previous simulation 
algorithms such as that presented in [1], The computational complexity of the new algorithm may 
be further reduced to 0(N) + 0(log 2 m + l) if it is implemented on m processors in parallel. 
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Abstract 

The dynamic equations of motion of an n-body pendulum with spherical joints are derived to be a mixed system 
of differential and algebraic equations (DAE’s). The DAE’s are kept in implicit form to save arithmetic and 
preserve the sparsity of the system and are solved by the robust implicit integration method. At each solution 
point , the predicted solution is corrected to its exact solution within given tolerance using Newton’s iterative 
method . For each iteration , a linear system of the form J AX = E has to be solved. The computational cost for 
solving this linear system directly by LU factorisation is 0(n s ), and it can be reduced significantly by exploring 
the structure of J. This paper shows that by recognising the recursive patterns and exploiting the sparsity 
of the system the multiplicative and additive computational costs for solving J AX = E are 0{n ) and 0(n 2 ), 
respectively. The formulation and solution method for an n-body pendulum is presented. The computational 
cost is shown to be nearly linearly proportional to the number of bodies. 

1 INTRODUCTION 

The general modeling and formulation of an open-chain multi-body system with spherical joints was presented 
by Chou, Singhal, and Kesavan in 1986 [1]. In this paper, we are interested in a single open kinematic chain 
without branching which is a special configuration of a general open-chain system. Much attention was paid to 
this open-chain system by researchers, such as Armstrong [2], because the system is simple and its configuration 
is similar to robot arms. In Armstrong’s work, he presented an 0(n) algorithm for the computation of robot 
forward dynamics. However, in the conclusion of his paper he stated that his program worked only for the 
joints with three degrees of freedom (i.e., spherical joints). He also claimed that his algorithm can be enhanced 
to include prismatic and revolute joints. In the author’s opinion, once we have joints which possess less than 
three rotational degrees of freedom connecting two bodies, the bodies are rotationally dependent. In this case, 
the equations of motion can not be decoupled easily, and we may not be able to find an 0(n) algorithm for the 
complete calculation of robot forward dynamics unless we use very simple and primitive integration methods 
or other techniques such as parallel computing. In other words, if we use an unreliable integration routine we 
may get invalid solutions. 

A series of n rigid bodies joined sequentially by spherical joints form an n-body pendulum. The bodies are 
allowed to rotate freely in space, but the motion of translation between adjacent bodies is constrained by the 
joint. This sequence of bodies can swing in any direction with one end fixed at the ceiling through a spherical 
joint. Here, we derive the equations of motion of this pendulum in the form of a mixed set of differential and 
algebraic equations (DAE’s) and solve the equations using the robust implicit integration method developed by 
Petzold [3,4]. The DAE’s are kept in implicit form to save arithmetic and preserve the sparsity of the system. 
At each solution point, the predicted solution is corrected to its exact solution within the given tolerance using 
Newton’s iterative method. For each iteration, a linear system of the form JTAX = E has to be solved. The 
computational cost for solving this linear system directly by LU factorization is 0(n 3 ). In order to reduce 
computations, we explore the structure of J and take advantage of the system sparsity. This paper shows that 
by recognizing the recursive patterns and exploiting the sparsity of the system the multiplicative and additive 
computational costs for solving J AX = E are 0(n) and 0(n J ), respectively. 


88 


2 MATHEMATICAL MODEL 


An n-body pendulum was modeled with graphs by Chou and was presented in [5]. Based on this graph-theoretic 
model, the mathematical model was derived as a set of DAE’s and was written symbolically as 


E(X,X,t) = 0 


( 1 ) 


where the vector of unknown variables is 


x = [ R? v? P? s? 


( 2 ) 


The vectors R& and V* are the collection of displacements and velocities of all the bodies, and the vectors P(, 
and S(, are the orientation parameters (we use Euler parameters here) and their derivatives. The superscripts 
”u” and n v” indicate the dependent set and the independent set, respectively. The equations include the 
following six sets of equations: 


E = 


r I 2 (P». S», t) 

L r i 3 (p;. s;, t) 

and they are a total of 14n scalar equations in 14n unknown variables. 


T F(V t , P», S», S», t) 
‘G(R*, Ps, t) 

^(Vt, P t , 

t) 


Si, t) 


(3) 


3 SOLVING THE FULL SYSTEM 


At each solution point, we can solve the system equations E directly, using the implicit integration method 
[3,4] where the predicted solution is corrected to its exact solution within a given tolerance using Newton s 
iterative method. For each iteration, a liner system 

J AX = E 

has to be solved by LU factorization. The matrix J is the system Jacobian matrix which is specified by the 
formula given by Petzold [4]. The vectors AX and E are the vectors of corrections and residuals, respectively. 
Letting the Jacobian of an implicit function F with respect to the variable V be F v i we can write the Jacobian 
matrix J and the vectors AX and E of the full system symbolically as follows: 


' l G R 

0 

0 

0 

‘G P . 

l G P • " 


’ AR» ' 


tG i 

0 

0 

0 

0 

> 

o o o 

‘Gs« 

r I I- 
0 
0 

1 65* 
0 

r r3 
i S* 

0 

‘G P . 

0 

0 

r ii>. 

*G P . 

r t2 

ip. 

r t 1 
ip. 


AV» 

ASf 

AS! 

AP? 

A Til) 

= 

‘G 
T I 2 
1 ’I 3 
r I x 

0 

T F V 

r P s . 


r F P u 

J 


L J 


r F . 


(4) 


Solving the full system means that the system is solved by implicit integration in which the full linear system, 
JAX - E, is solved by LU factorization without reducing its size. However, some structural information in J 
and E can be utilized to reduce computations when they are evaluated. 


3.1 Recursiveness and Special Structures 

Some equations in E possess recursive properties which can be further utilized to reduce computational cost. 
This recursive nature also causes the sub-matrices in J to have special structures which can be exploited to 
minimize the computational cost. 

3.1.1 Translational Kinematic Constraints*^ 

In [5], the translational kinematic equations were derived as 

« 

‘Gi = K hi - R , ?1 - AiT { + Yl AkBlk = 0 

fc— l 
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Let 


j Ai = XI* =1 -A*** 

*> = -At*. 

1 Ao = 0 

then, ( 5 ) becomes *G< = R* ; — R, Pl — + A,- = 0 , or recursively 

*Gj = R*< — R, Pl - Ai Ti + (A i _ 1 +x j ) = 0 


( 6 ) 


When we evaluate the residual vector *G, we only have to compute R* i , R,„, A^ii, and x, for i — 1 , ..., n 
once. Instead of computing A< for each equation, we only compute x* and add the previous A t l ; that is, 
Ai = Ai_! + Xj. 

The Jacobian entries corresponding to the equation *G are l Gp and *Gp. The Jacobian of f G correspond- 
ing to R* can be shown to be a unit matrix easily. Here we show that *Gp has a special structure. It is written 
as 

d *G, & *G, 

■5prr - 

e ‘G, 


l G P = 


Since *G{ is a function of p*, , pj J( 


The matrix in ( 7 ) becomes 


0 *G 


flPi 




0 

#P». 


* P», 


7 p^ 




( 7 ) 


., and p*., for each row *(» = !, ...,n) we have 
0‘Gi 


l G P = 


9 Vi* 


r g‘G. 
TpT 
a‘G> 
TpTT 


»*G. 


= 0; k = * + 1, n 


( 8 ) 


0 

«‘G, 

TpC 

a*G- 

IT 


0 

0 


a*G. 

?P^ 


0 

0 


alG, 

^ P»n 


( 9 ) 


and it is a blocked lower-triangular matrix. Each blocked entry is a 3 x 4 matrix. The entries in each column 
«(* = !, n) in ( 9 ) are derived, in Appendix D.l in [ 5 ], as 


Hence, ( 9 ) becomes 


* Gp = 


1 

= 2 Ei a, - 2 Ei ^ 


1 

= 2 Eiif, k=i + l,...,n 


2-Ei(*i — *i 

) 0 0 - 

0 

2 E\ Ax 

2 E 2(3-2 — *i) 0 

0 

2 E\ 

2 E 2 *2 2£?s(a3 — rj) 

0 

2 Ei ax 

2 E 2 a* 2£?3 A3 

* * 2 E n (a n 


r ») J 


( 10 ) 


( 11 ) 


To compute t Gp, we only have to evaluate 2 E{ *4, 2JE, r j( and 2J5j(aj - r { ) once for i = 1 n. 

The matrix *Gp can be further partitioned into and *Gp* as in ( 4 ). This involves permuting selected 

columns; however, the special structure is retained for these matrices after they are partitioned. 
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3.1.2 Translational Velocity Kinematic Constraints *G 
Translational velocity constraints are written as 

n 

‘G < = V ki - AiTi + XI = 0 


( 12 ) 


*=1 


Let 



then, (12) becomes ‘G 4 = V 6i - + A< = 0, or recursively 

*Gi = V». - Ain + (A<_t + if) = 0 (13) 

Similar to (6), we only have to compute Vj,, AiTi, and in once. Instead of computing A 4 , we compute Ai_i +i, 
recursively to save computations. 

The Jacobian matrices correspond to *G are *Gy, ( G$ , and t Gp in which *Gy can be shown to be a unit 
matrix easily, and *Gs and *Gp will be shown to possess special structures. The Jacobian matrix of G with 
respect to Sj is defined as 


‘G s = 


d *G 

■asT 


9‘G, a *G, 

»s», 6 Sj, 

fl‘G] o‘Gi 


a*G, o' G_ 

a s», a s kj 


a'G, 

TsC 

a»G, 


a‘G„ 


(14) 


9 J 

Since is a function of s*,, 8fc a , . . and s h . } for each row i (t = 1, ...,n) we have 

d'G i 


d *b h 


= 0; k = % 4- 


(15) 


Hence, (14) becomes 


4 G s = 


, g *Qi o 

* s fi . 

a *Ga a *Ga 

*S>, 


, ^ ^ 

L as*, a s * 3 as * 4 


0 

0 


a‘GL . 
a 8* n J 


(16) 


The entries in each column t (t = 1, ...,n) in (16) are derived, in Appendix D.l in [5], as 


= 2 Ei a 4 - 2U 4 r 4 


= 2E<a<; * = i+l,...,n 


(17) 


Comparing (17) with (10), we find that 


•w° 


d‘G 


= *G P 


d s k sp k 

Computing *Gs is not required. Once we compute t Gp, we get 'Gj. 


(18) 
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The Jacobian, *Gp, is different. Since ‘G* is a function of p^, p 5 3 , . . and p^, l Gp is a blocked 
lower-triangular matrix like *Gs and is written as 




a‘G, 

*P», 

0 

0 

0 



tG ~ 

G,> - 5P* " 

»• G, 
*P», 

*‘G a 

*p». 

0 

0 




«*G. 

o'G n 

*‘G n ... 








&Ph» J 


The entries in each column i (i = 1, n) in 

(19) are 

derived, in Appendix D.l in [5], as 


r 

J 8P., 

= 2 Ei 

a* — 2 Ei Ti 





- 2 Ei 


= i -j- 



Hence, (19) becomes 

f 2F?i(ai — rj) 

0 


0 


0 


2E\ 2JE? 2 (a 2 — r 2 ) 

0 


0 

l G P = 

2E\ aj 

2 E /2 a 2 

2£rs(a3 — r 3 ) 


0 


2 Ei ftj 

2 E 2 a 2 


2f?a a 3 

■■■ 2 E n ( 

- 


(19) 


( 20 ) 


( 21 ) 


To compute *Gp, we only have to evaluate 2E{ a t -, 2E{ r*, and 2Ei(eti — r*) once for i = 1, . . . , n. 

The matrices l Gs and l Gp can be further partitioned into t Gs*> *Gs*, and *Gp* as in (4). This 

involves permuting selected columns. However, after permuting columns the special structures are retained for 
these partitioned matrices. 

3,1.3 Torque-Balance Equations r F 


The torque-balance equations for an n-body pendulum were derived as 

n 

T Fi = Tj. - r<Af — g) + ifA? J2 M k{y ih -g) 

Let 


( 22 ) 


h=i 


n = E»=< - g) 

= Af 4 (V K - g) 

r n +i = 0 

then, we can write (22) as r Fj = TJ — r, A^*j + s^Aj = 0, or recursively 


= TJ. — t<aJ + a ,- Af (z{ + Tj + i) — 0 


2 aT, 


(23) 


When we compute the residual of r F, we only have to evaluate TJ., a,-, and a,- A?” r; once for i — 1, 

Instead of computing r,- for each equation, we compute Zj and accumulate it recursively to save computations. 
The recursive term, Zi + r<+i, runs in backward sequence; that is, i = n, ..., 1. 

The Jacobian of r F has three parts: T FV, T Fs, and r Fp. They also possess special structures. First, the 
Jacobian of 'F with respect to V> is written as 


8 r F d r F d r F 
Fv — TTTT- + <r — = <T I— = cr 


d\ h 


av h 


dv h 


a r F, 

»-F, 

»'F, I 


ov» 

«v,. 


a'F, 

a "F, 


oK 

«v,„ 

Q1 F. 

g'g- . 

.. o 



*v, n J 


( 24 ) 
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Since r Fi is a function of V kj , V> i+1 , . . and V fc „, for each column * (* - 1, .... «) we have 


^ r jji 

a — = 0; fc = »+l,...,n 

dv bi 


( 25 ) 


The matrix (24) becomes a blocked upper- triangular matrix and is written as 


T F V = o 


f g r Fi o 'Fi ... o-F. 

»V t , sV,, »V*. 

o ... 

«v k . 


(26) 


The entries of each row * (* = 1, ... ,n ) in T Fy are derived as 


. 0 "F, 
*V ki 


g r F. 

»V k „ J 


<r — * = — crMiTiA.J 4 <t A f; A ^ 


Hence, (26) becomes 


r FV = o - 


<r * = <rMk&i-A.J ; k — i 4* 1 ( ---i n 

» v * k 


Mi(ii — ?i)4j M 2 *i >li 

0 Afj(aj — rj)Aj AfjijAj 

0 0 Afj(«3 — ra).Ag 


(27) 


M n a x A^ 

Af n asA 3 


Hft(®n ) ^4 n 


(28) 


To compute T Fy , we only have to evaluate cr M; (a^ — Ti)A.J and <r Mk&iAj once for i — 1, ..., n. 

The Jacobian of T F with respect to P k is similar to (24). However, since 'Fi is a function of p ti , the 
off-diagonal entries become zero: 


d'F* 0 T F< 


= 0; k = i + 


9 p»i 9 p* k 

for * = 1, Hence, T F P becomes a diagonally blocked matrix and is written as 

0 

I < ’ F “ - 

d r F 


(29) 


r F P = 


r o'F, 
SpJT 

» 


9 P» 


0 

0 


»P*. -I 


(30) 


where 


a f Fj 

9Vbi 


£Tj. _ . , d{A?T t ) 

9 Pt, F ’ 9 p ki 9 p 4i 


9 Tj, 


2rjGj *j 4- 2i<G( Tj 


= {2 IiGi + AGiCf I iGi - 4(s IvbJIiGi - 2^G,} 

— 2t{G{ *i + 2ajGr; Tj 

where ^ = 2I j G i p i . and ^ E ^ x are defined in Appendix D.2 in [5]. 


( 31 ) 
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( 32 ) 


* F s has a structure identical to that in *Fp. The oiF-diagonal entries are zero: 


d r F k 8 T F k 

+ < 7 - 


d T Fi d T F i n . . , 

+ <r — — = 0; * = t + n 


8 s ki 8 s tj 8a ik ' 8 i kl 

for i = 1, n. The Jacobian * F $ is diagonally blocked and is written as 

0 

8 r F — ‘ " 


T F S = 


8 S, 


+ <r- 


tr _ f 

Jsl 


0 -f* (7 1 — * 


4^- +<r££- 

0S lfV 


(33) 


where 


d^Fj 
8 s hi 


tTFj _ 8*T b . a dTT *i 


8 k bi 8 Si 4 8 k bi 

= {iGiGfliGi - 4(s£p h .)IiGi + 2 - <r(2 !&) (34) 

The derivation of (31) and (34) is given in Appendix D.2 in [5]. 

3.1.4 Equations Concerning Euler Parameters: r I 1 , r I 2 , and r I 3 

When we compute the residuals of * I 1 , r I 2 , and r I s , there is no applicable recursive structure that can be 
utilized. However, their Jacobians do have some special structures because these equations are derived for each 
body. Matrices with diagonally blocked structures are to be expected. 

First, the normality constraint for each set of Euler parameters is written as 

r I i = P£p»i -1=0 (35) 

Since * 1} is a function of only, we have 

8*1} 8*1} , , , 

j-r = jzr = 0: * = i+ 1 n 36 

& Pb h o 

From (36), the Jacobian matrix of r I 1 with respect to p* lf p* a , . . ., and p* w must be a diagonally blocked 
matrix. The same argument can be applied to T I 2 and r I 8 such that their Jacobians with respect to Pj and 
S* are diagonally blocked. 

Since the blocks concerning these three sets of equations are locally processed, as shown in [5], the Jacobian 
matrices T Ip*, r Ip*, and r Jp. possess a similar structure which is illustrated as follows: 




€i 0 0 ■ 

0 ... 0 

0 6 ... €n 

r r T 2 i 


<Ti 0 ... 0 

0 *2 ••• 0 


= 

... . 

L J 


o o • ••«•„ 



i/\ 0 ••• 0 

0 i/2 . . • 0 

. o 6 ... u n . 


(37) 


where 


\ "i 


[di ft] 

~cr 0 

0 —<r 

0 0 

[ai h* Ci] 


0 1 
0 
-<r 


are defined in Appendix B.l in [5]. 
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4 SOLVING THE REDUCED SYSTEM 

Solving the reduced system means to solve the linear system using the technique of dimension reduction. 
° . .. * < w a . ii (a\ nivac 


Ju 

Jl2 

' AXr 1 


Ei 

J 21 

J 22 

ax 2 J 




Since J u is a unit upper-triangular matrix and is non-singular, we can obtain 

' J u J 12 1 f AX! _ F Bi ' 

0 Jr J [ AX 2 L Er . 


For the independent corrections AX 2 , we solve 

J aAX 2 = Er 

by LU factorization where 

f J R ~ J 22 — J 21^X1^12 

\ E r = E 2 — (^21^11 )Ei 

The dependent corrections AXi are obtained by backward substitutions: 

AXx = JjftE!- Ji 2 AX 2 ) 


4.1 Solving for Independent Corrections 

The independent corrections AX 2 are obtained by solving the linear system JrAX 2 = E R . The evaluation 
of J R and 'Er is accomplished by developing a set of formulae such that the sparsity of the linear system is 
completely utilized. From (4), we derive the following formulae for Jr and Er: 

Jr = r F p* — ( T Fy)( t Gp .) — ( r ^4)( r Jp») 

- - r* 6 )( r i)>*) ( 41 ) 


and 

where 


'F - ( r F v )*G - ( r !P 4 ) r I 2 - ('#5) 'I 3 - 


( r * 4 = r F s - - ( r -Fv)(‘G5-) 

< r *5 = T F S • - ( r F V ')(‘G s .) (43) 

[ = r F P . - ('JV)(‘G P -) 

When we compute Jr and Ep using the above formulae, several matrix multiplications will be performed. 
The special structures discussed in the previous section can be exploited to reduce computation cost. 


4.2 Solving for Dependent Corrections 

Once we obtain the independent corrections, AX 2 , the dependent corrections, AXi, can be computed by 
J 11 (Ei — F 12 AX 2 ). However, using this formula directly is not practical since J u has a very simple structure 
with many zeros. In order to utilize the sparsity completely, a set of formulae are derived symbolically for the 


dependent corrections. They are as follows: 

AP£ = 'I 1 - ( r /J>»)APj (44) 

ASJ = T I 3 - { T I a P .)APl (45) 

ASj = T I J - ( r Jp.)APJ (46) 

AVi = ‘G — (‘G s -)AS? — (*Gs»)ASj 

- (‘Gp.)AP? - (*Gp.)APJ (47) 

AR„ = ‘G - (*Gp»)AP, - (‘G P .)AP; (48) 


Again, when we compute the dependent corrections using the above formulae, the special structures discussed 
in the previous section can be exploited. 


95 




5 LINEAR COMPUTATIONAL-COST SCHEME 


Rewriting the full system (4) by 

• combining S£ and SJ to form S in the original order 

• combining P£ and PJ to form P in the original order 

• combining r I 1 and r I 3 to form I 

• combining — 2cr r I 2 (scaled by — 2<r) and T F to form F 

• re-arranging the columns and rows 


• dropping the subscripts and superscripts for clarity and simplicity 


gives 


or 


Gp Gp 0 0 


■ AR ' 


G 1 

0 Ip 1$ 0 


AP 


I 

0 Fp Fs Fy 


AS 

— 

P 

. 0 G P Gs Gy „ 


AV 


. G . 


U G P 0 0 1 


' AR ' 


■ G ' 

0 U Is 0 


AP 


I 

0 Fp Fg Fy 


AS 

— . 

F 

. 0 G P G s U 


AV 


. G . 


(49) 


(50) 


where Gp and Gy were shown to be unit matrices. In order to make Ip a unit matrix, Gaussian elimination 
has to be applied to Jp, /$, and I locally. An example for one body is given in Appendix B.2 in [5]. We intend 
to solve this linear system with a computational cost which is linearly proportional to the number of bodies in 
the system. 


5.1 Basic Theory 

Applying the technique of dimension reduction to (50) gives 


U Gp 0 0 ' 


■ AR • 


’ G ' 

0 U Is 0 


AP 


I 

0 0 Jp 0 


AS 


Ep 

* 

O 

O 

o 


AV 


. ®Ji . 


where 

( J R = U — (Gs — Gpls) JJ 1 F v 
\ E* = 6 - (G P ) I - {G s -GpI s )Ji 1 (F-FpI) 

and 

f J F — F 5 " F p Is 
\ Ep = F - (Fp)I - (Fv) AV 

Instead of solving (50) directly by LU factorisation, we solve 


(51) 


(52) 

(53) 


J*AV = Ej, 


to obtain AV; then we solve 

J pAS = Ep 


to obtain AS. The rest of corrections can be computed by 

/ AP = I - (J 5 )AS 
\ AR = G - (G P ) AP 


(54) 

(55) 


(56) 
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5.2 Invertibility 

In [6], we have shown that J R is non-singular if J is non-singular. From (52), we have to find J F in order to 
evaluate Jr. We would like to show that Jp is invertible when <r — ► oo. 

The matrix Jp is diagonally blocked because it is computed by F s — Fp Is where F s , Fp, and •* 5 f re 
all diagonally blocked matrices. Each block at the diagonal position in Jp is computed by its corresponding 
block in F s , Fp, and Is- Hence, we write 

J Fi = F Si - F Pi I Si ( 57 ) 

for each diagonal block. From (B.12) in Appendix B given in [5], we find that I Si is a matrix scaled by £. 
Hence, we may write 

F Pi I Si = -Fi 


Also, from (34) we find that F Si may be written as 

F Si = -<r(2IiGi) + 


Hence, we can write 

Jp. = - <r(2 IiGi) + r< - ir. 

(58) 

If <r — * oo, we have 

JFi ** — <r(2T|<?i) 

(59) 


However, we have added one more equation into the F block as mentioned previously. This increases the 
number of equations in each block in F from three to four. The equation is r I^ scaled by 2<r, that is, 


- 2<r r I? = — 2 apf »i = 0 


(60) 


The Jacobian corresponding to s is 

- 2,7 p? (61) 

Combining it with (59) gives 

J Fi » -<r(2J i P < ) (62) 

Adding equation (60) into block F is equivalent to using Euler’s equations in quaternion space [7]. We 
have proved that the coefficient matrix with respect to s in Euler’s equations for inertial torque in 4-space is 

a non-singular matrix [7], This matrix is exactly the same as 2Jt(p*) T in (62). Therefore, as well as Jp 
are invertible if a — ► oo. 


5.3 Summary of Computational Cost 

The detailed discussion of the linear-cost scheme was presented in [5], In terms of multiplicative ( x ) and additive 
(+) operations, the computational cost required for solving the reduced linear system for the linear-cost scheme 
is summarized also in detail in [5] as Tables 2 to 6. Adding up the totals in the tables gives the grand total of 
operations required for the linear-cost scheme: 

f x : 657n — 324 

\ + : 15n 3 + 574n - 321 1 ' 

This cost is not for obtaining one solution point; it is the cost of solving the reduced linear system for one 
Newton’s iteration. The computational cost for reaching a solution point depends on output step size and the 
number of iterations. 


6 IMPLEMENTATION AND RESULTS 

A program, called LINPEN (LINear-cost scheme for simulating an n-body PENdulum), was coded to sim- 
ulate a pendulum. The program is equipped with modules for reporting various physical quantities such as 
displacement, velocity, and acceleration and was run on a VAX 750 computer. The motion of a user-specified 
pendulum can be displayed on a GRIN NELL display terminal. 

Each body is drawn as a standardized 3-D polygon and is projected in perspective, on the terminal screen 
according to its simulated position and orientation. An example of simulating a 3-body pendulum for ten 
continuous positions is illustrated in Figure 1. The figure looks exactly the same as displayed on the display 
terminal. 
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Figure 1: Simulation of a 3-body pendulum. 


6,1 Comparative Study in Computational Cost 

For each solution point, a linear system, J AX = E, may have to be solved by LU factorization several times. 
There are four schemes available to accomplish this: 

• solving the full linear system by LU factorization where J is approximated by numerical difference 

• solving the full linear system by LU factorization where J is evaluated by the exact Jacobian matrix of 

E 

• solving the reduced linear system by LU factorization where J is evaluated by the exact Jacobian and 
is further reduced to its optimal size using the technique of dimension reduction 

• solving the reduced linear system by LU factorization where J is evaluated by the exact Jacobian and 
is further reduced to its optimal size using the linear-cost scheme 

For each iteration, the analytical count of operations required to solve the full system, reduced system, and 
the linear-cost system with exact Jacobian is tabulated in Table 1. Although the additive operations for the 
linear-cost system is 0(n 2 ), we may consider the cost linear because the amount of time required to perform a 
multiplication or division on a computer is about the same and is considerably greater than that required to 
perform an addition or subtraction. 

The simulation of an n-body pendulum was run for the four systems. For each system, the program was 
run ten times from one body to ten bodies. One hundred solution points were generated, for each run, with an 
output time-step of 0.01 seconds and a tolerance of 10“ 7 . Total CPU time for solving four systems is calculated 
in terms of CPU time per residual-call or CPU time per Jacobian-call. Total CPU time per residual-call versus 
the number of bodies is plotted in Figure 2. 



Table 1: Analytical count of operations required for solving three systems with analytical Jacobian 
matrices. 
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7 REMARKS 


The mathematical model of an n-body pendulum with spherical joints and its solution method have been 
presented. The mathematical model derived here is a mixed system of differential and algebraic equations 
(DAE’s) in implicit form. A model of state-space equations can be derived if we do further complex substitu- 
tions. However, in the form of DAE’s the equations of motion provide more sparsity, and this sparsity can be 
exploited when numerical solution methods are applied. The modeling and formulation of an n-body pendulum 
is the first study, for its similarity to a robotic manipulator in structure and for its simplicity in equations. 

We also present four solution methods for solving the equations of motion of this n-body pendulum. The 
first method solves the linear system, J AX — E, directly by LU factorization where J is approximated 
by numerical difference. The second method solves the linear system directly by LU factorization, but J is 
evaluated by its analytical Jacobian matrix with some structural consideration. 

The third method employs the technique of dimension reduction to transform the full system to its reduced 
system, Jr AV = E r, such that LU factorization is performed on the reduced system only. This technique 
utilizes the sparsity of the system completely and saves a considerable number of computations. The four 
methods also employ the technique of dimension reduction to reduce the size of the system. However, they 
further exploit the structure of the system such that the reduced Jacobian generated possesses a very special 
structure which can be utilized to solve the system in a nearly-linear computational cost. 

The comparative study in computational cost for these four systems in the paper gives strong evidence of 
the success of the theory developed. 
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Sliding Control of Pointing and Tracking with 
Operator Spline Estimation* 

Thomas A. W. Dwyer III 1 Fakhreddine Karray and Jinho Kim 
3rd Annual Conference on Aerospace Computation Control 


Abstract 

It is shown in this paper how a variable structure control technique 
could be implemented to achieve precise pointing and good tracking of a 
deformable structure subject to fast slewing maneuvers. The correction 
torque that has to be applied to the structure is based on estimates of upper 
bounds on the model errors. For a rapid rotation of the deformable 
structure, the clastic response can be modeled by oscillators driven by 
angular acceleration, and where stiffness and damping coefficients are also 
angular velocity and acceleration dependent. By transforming this "slew- 
driven" elastic dynamics into bilinear form (by regarding the vector made 
up of the angular velocity, squared angular velocity and angular 
acceleration components, which appear in the coefficients as the input to 
the deformation dynamics), an operator spline can be constructed, that 
gives a low order estimate of the induced disturbance. Moreover, a "worst 
case" error bound between the estimated deformation and the unknown 
exact deformation is also generated, which can be used where required in 
the sliding control correction. 


^Supported in part by SDIO/IST and managed by AFSOR under contract 
F 49620-87-00 1013, and by NASA grant NAG- 1-613 
"Professor of Aeronautical and Astronautical Engineering, University of 
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A Survey on the Structured Singular Value 
Andy Packard 
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and 
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Abstract 

The structured singular value, p, is an important linear algebra 
tool to study a class of matrix perturbation problems. It is useful for 
analyzing the robustness of stability and performance of uncertain, 
(nominally) linear systems. Computation of (M) is difficult, and 
usually, upper and lower bounds are all that can be reliably 
computed. Upper bounds give conservative estimates of the sizes of 
allowable perturbations. The maximum singular value of a matrix M 
is an upper bound for (M). As an upper bound, it can be improved 
by finding a transformations to the data (ie. M ) which do not change 
the structured singular value, but do reduce the maximum singular 
value. Typically, upper bound algorithms involve searches over sets 
of transformations to yield the tightest bound. Lower bound 
algorithms produce small, destabilizing perturbations. In general, the 
algorithms are intelligent searches for minimum-norm solutions to 
multivariable polynomial equations, and are based on various 
optimality conditions that hold at the global (and, unfortunately, 
some local) minima. This paper reviews the current methods to 
compute both of these types of bounds, covering theoretical 
justification and extensive numerical experience with the various 
algorithms. 
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Algorithms for Computing the Multivariable 

Stability Margin 

Jonathan A. Tekawy* Michael G. Safonov^ and Richard Y. Chiang * 


Abstract 

Stability margin for multiloop flight control systems has become a critical 
issue, especially in highly maneuverable aircraft designs where there are in- 
herent strong cross- couplings between the various feedback control loops. To 
cope with this issue,we have developed computer algorithms based on non- 
differentiable optimization theory. These algorithms have been developed for 
computing the Multivariable Stability Margin (MSM). The MSM of a dynam- 
ical system is the "size” of the smallest structured perturbation in component 
dynamics that will destabilize the sytem. These algorithms have been coded 
and appear to be reliable. As illustrated by examples, they provide the basis 
for evaluating the robustness and performance of flight control systems. 


1 Introduction 

Accurate knowledge of the dynamical model associated with the design of modern 
flight control system is becoming more difficult to obtain. This is especially true for 
the design of the next generation fighters where many of the performance specifica- 
tions go beyond the capability of the aircraft currently in service. Robust control 
analysis methods have received considerable attention in recent years as a possible 
solution to the problem of controlling systems for which the given model contains 
significant uncertainty [Saf 1] [Doyle2]. The central feature of these methods is their 
effectiveness in handling an unknown-but-bounded class of plants, instead of the nom- 
inal plant only. 


’"Flight Control Research, Aircraft Div., Northrop Corporation, Hawthorne, CA 90250 
*E. E. - Systems Dept., University of Southern California, Los Angeles, CA 90089-0781 
* Flight Control Research, Aircraft Div., Northrop Corporation, Hawthorne, CA 90250 
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The uncertainty in the nominal plant arises from several different sources: For 
gain-scheduled aerospace vehicle control systems, typical uncertainties in the plant at 
each design point consist mainly of modeling errors due to uncertain aerodynamic co- 
efficients, linearization, model reduction, neglected dynamics, time-delays, etc. Aero- 
dynamic coefficients developed from wind tunnel testing or computational fluid dy- 
namics usually are different from those obtained from actual flight data. Linearization 
will also affect the nominal plant behavior. Nonlinear effects such as actuator satura- 
tion and rate limits are neglected altogether when a model is linearized. Parameter 
drift will also affect the nominal plant. There may be dynamical modes which are 
intentionally or unknowingly neglected. In addition phase loss which results from 
time delay also leads to an uncertainty bound. 

The uncertainty may be loosely classified as falling into two categories, structured 
and unstructured. Structured uncertainty arises from specific component or parame- 
ter variations. Two examples of structured uncertainty are variations in weight and 
drifting aerodynamic parameters. Unstructured uncertainty is any other sort of uncer- 
tainty which can be regarded as a frequency-dependent norm bounded perturbation 
matrix. High frequency modeling errors are one type of unstructured uncertainty. 
The linearization of the nonlinear equations of motion contribute to both classes of 
uncertainty. Actuator rate and position limits have a distinctive signature which can 
easily be isolated, so that they fall into the class of structured uncertainties. On the 
other hand, the effects of nonlinear kinematic terms can only be bounded, therefore 
necessitating an unstructured uncertainty representation. 


2 Robustness Measure 

Safonov [Saf 2], who built upon different, but related, conic-sector nonlinear stability 
theory work of Zames [Zames], reinterpreted the conic-sector stability concepts in 
order to deal with uncertainty and robustness issues. Safonov, Doyle and Fan, to 
name a few, have contributed to the continuing development in this area [Saf 3] 
[Saf 4] [Doyle2] [Fan] . 

Basically the robustness measure is done by lumping uncertain deviations from a 
nominal system M(a) into an uncertain matrix A(a) resulting in an uncertain feedback 
system with loop transfer function A(s)M(j) as shown in figure 1. 

Then, the Multivariable Stability Margin (MSM), ” K m ” , is defined as the smallest 
stable, norm-bounded perturbation A(a) that can destabilize the system. While A m 
is in general difficult to compute, a reasonably tight lower bound K. m theoretically 
can be computed using diagonally scaled singular values [Saf 3] [Doyle2] [Fan]. The 
plot of K m (w) vs. frequency identifies tolerable levels of parameter uncertainty as a 
function of frequency. 
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A=diag[A 1 ,A„....,A l ,..,A,] 

Figure 1: Robustness analysis model. 

For unstructured uncertainty, the maximum singular value has been shown to be 
useful in bounding the multivariable stability margin. However, the bound can be 
very conservative in the case of structured uncertainty. The singular value analysis 
will attempt to find the worst direction of the uncertainty that in reality impossible to 
exist. To deal with the case of diagonally structured A, Safonov [Saf 4] introduced the 
two-sided structured Multivariable Stability Margin (MSM), denoted K m , and Doyle 
[Doyle2] introduced the term Structured Singular Value (SSV), denoted fi, to describe 
the reciprocal, /x(M(a)) = 1 / K m (M(a)). Diagonal perturbations are quite general 
and flexible if one considers parametric uncertainties (e.g. aerodynamic coefficients). 
Traditionally one defines K m and fi for ’’two-sided” magnitude-bounded uncertainties 
which may be either positive or negative; but in cases where the sign of the uncertain 
A; is known a priori one may modify the definitions of K m and fi accordingly. 

When the uncertainties are known to cover both positive and negative pertur- 
bations, the SSV of Doyle and MSM of Safonov provide a ’’tight” (to within 15%) 
condition for robust stability. This condition is measured by representing directly the 
individual sources of uncertainties in the form of block diagonal perturbations. 

Definition: Given transfer functions G(a),a(s) and fe(s), we write 

G(s) € sector[a, 6] 


if 

where 

and 


IG^'w) - G(ju;)| < |r(ju>)| Vtu 
C(s) = (a(a) 4- b(s))/2 
r(s) = (a(s) - 6(s))/2 


Assuming M(s) and A(s) to be stable then the one-sided MSM, ” K mi ” and, the 
two-sided MSM, ” K m] n are defined by the following (see figure 2 and 3): 
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• One-sided K m : 

The system is stable for all A with 

A< G sector[0, K mi ) V* = 1, •••,«. (1) 

• Two-sided K m ■ 

The system is stable for all A with 

A i £ aector[-K mi ,K mi ] V* = 1, (2) 

For any diagonal matrix D, a practical upper bound on fi = l/if TOJ is <r m „(DMD -1 ). 
Further, it is known that for 3 or fewer A i ’s that the minimum over D of this 
upper bound is actually equal to fi [Doyle2]. Safonov and Doyle proved that the 
minimization problem of P~*) is convex in D' = log(D), so that every local 

minimum is a global minimum. Furthermore, computational experience has shown 
that minimum of <j max (DM D~ l ) over D is within 15% of fi. So, we choose to work 
with this upper bound, if ~^ T to calculate the reciprocal of two-sided MSM. Practical 
upper bound for the reciprocal of one-sided MSM can be easily derived by using conic 
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Figure 4: 


sector property of Zames [Zamesj; viz. K m x is bounded above by 

iC, = rnax[)fnin D \ mam {DMD- 1 + (tfMZT 1 )*)^] (3) 

By modifying the former equation to include the permutation matrix , a less 
conservative bound for the two-sided real MSM, K mr , is given by 


Km r — max[^max^.minDX m am{DMif>iD 1 + (DMfcD* 1 )*), 0] (4) 


Here 


<t>i € » = 1, • • • ,2" 

and $ is the set of ail permutation nx n diagonal matrices. 


$ o dta<7[(±l), ,(±1)] 

The bound (4) is similar to the one proposed by Jones [Jones]: 

min Umax M (j>iD~ x + (DM^D' 1 )*) (5) 

mt 

Although, equation (5) will lead to more conservative bounds. Geometrically equation 
(4) is shown in figure 4. 

It should be mentioned at this point that several software packages are available 
to compute the two-sided MSM , however , they are not accessible to the authors to 
be evaluated. 


3 K m Computation 

The computation of cr max is straight forward using available numerical software (Lin- 
pack). For both one-sided, real and two-sided cases, we have to solve an optimization 


problem. However for all of these problems, the analytical gradient is available, so 
accurate solution can be obtained. However, when several eigenvalues or singular 
values coalesce (i.e., have multiplicity greater than one) the function is nondifferen- 
tiable (” creases” produce direction-dependent derivatives), so that a more complex 
algorithm of computing a descent direction is required. Before actually solving either 
case, one can approximately prescale the system matrix AT by substituting for M the 
matrix DMD~ l where D minimizes the Frobenius norm of DMD~ l [Osborne]. 

The monotonic transformation of D — * D ' , with D = Exp(D'), transforms the 
problem into a well behaved convex optimization [Saf 5]. The initial guess for D was 
taken to be equal to the identity. This initial guess was used only for the first frequency 
value in the given range. The solution obtained for a particular frequency point was 
then used as an initial guess for the next value. Suppose that the largest eigenvalue 
is simple, then a descent direction is calculated directly using the Davidon-Fletcher- 
Powell technique. In the case that the largest eigenvalue has multiplicity greater than 
1 and the function is not continuously differentiable, a generalized gradient is used 
to determine a descent direction. Once this is done, the minimal point can be found 
in the specified descent direction by using a well known "binary search” algorithm 
of Bolzano. These steps are repeated until the global minimum iB located (i.e. gra- 
dient is zero). Convexity of c^, ait (e D Me~ D ) and A mo *|(e £> Me~ D + ( e D Me~ D )’) 
ensures that this procedure is convergent to the global minimum. These steps can be 
summarized as follows: 

1 . Initialize M\ = M\ D[ = 0; k = 1 . 

2. Scale M^+i = e D *A/fce“ c *; set D k+ r = 0 

3. Find the search direction 

— Davidon Fletcher Powell (DFP) deflected gradient. 

— DFP generalized gradient for multiplicity > 2 

4. Unidirectional search. 

— Method of Bolzano (Fig. 5). 

5. D' k+1 <— D' k+ j-f- stepsize * search direction. 

6. k = k 4- 1; go to step 2. 

Step 5. of the algorithm involves varying the diagonal scaling matrix D' along a line 
by adjusting the scalar parameter stepsize. The size of e Dt +‘ could approach the 
value of oo. To prevent this, as the stepsize grows, A/fc+i, is repeatedly updated to 
A/*. + i <— e D *+‘ Me~° k + l and stepsize and D' k+1 are reset to zero. This is done as often 
as needed to prevent numerical overflow when e 1 ^ 1 is evaluated. 
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3.1 Generalized Gradient. 

This discussion is not at all self contained and only key results will be stated. An 
excellent reference for this algorithm is [Polak]. In the case where the greatest eigen- 
value/singular value has multiplicity greater than one, the function ceases to be dif- 
ferentiable. In this case the gradient is not defined and more complicated ” generalized 
gradient” methods must be used to compute the descent direction. The generalized 
gradient at a nondifferentiable point is defined as the nearest point to the origin 
in the convex-hull of the set of directional derivatives at neighboring points; thus 
the computation of the generalized gradient at any point is itself a convex nonlinear 
programming problem. We employ an algorithm similar to that of [Doyle2, Polakl] 
to compute the generalized gradients of ^a *( e ° Me~ D ) and ^mai 2 (e D 'Me~ D + 
(e D 'Afe“ cl )’). Geometrically the algorithm is shown in figure 6 and summarized as 
follows: 

• Generalized Gradient is defined as: 

V flen t N r (Co{V(x ) | ||x|| = 1}) 


where 

Co(.) - the convex hull of the set (.). 

Nr(.) - the nearest point to the origin of the set (.). 
{V(x)| ||x|| = 1} - the set of directional derivatives. 

• Iterative algorithm for computing V pen : 

1. Initialize k = 1. 

2. Guess Xk and set y*. = V(xfc). 
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Figure 6: Generalized gradient. 

3. Find ifc+i by minimizing (yjj V(*fc+i)) subject to ||**+i|l = 1. 

4. Find y k +i = Nr( Co(yk, V(x fc+ j)). 

5. Increment k <— k + 1, go to step 3. 



3.2 Davidon-Fletcher-Powell Scaling. 

The unmodified generalized gradient determines a steepest descent direction. The 
steepest descent direction is simply minus the generalized gradient. Steepest descent 
usually works quite well during early stages of the optimization process but if the 
Hessian (second derivative) matrix has a large condition number, the method usually 
behaves poorly, and small zig-zagging steps, called "stitching”, take place (see fig. 
7). Stitching problems also occur when the multiplicity of <r max or A max is 3 or 
more. Therefore we use the Davidon-Fletcher-Powell (DFP) method to modify the 
generalized gradient in order to handle this phenomenon. This technique uses the 
previously calculated generalized gradient to estimate the Hessian and effectively 
rescale the function to make its Hessian better conditioned. This quadratic fit method 
requires fewer gradient evaluations and tends to converge faster. It should also be 
noted that the likelihood of stitching-induced premature termination of the algorithm 
(as can occur in the unsealed steepest descent technique) can be greatly reduced with 
the DFP scaling. 

4 Lateral Directional Flight Control Example 

4.1 Example 1 
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Figure 7: Minimization paths. 



Figure 8: Axis systems and sign convention 
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Figure 10: One-sided actuator uncertainty model. 

Nonlinear elements of actuators can be treated as linear conic-sector elements with 
structured uncertainty. This uncertainty can be modeled as one-sided or two-sided 
uncertain gains within the actuator model. This can be shown through an example. 
Consider a saturation curve below. If the input size is always less than C , then the 
saturation element is equivalent to a gain element with a magnitude of one, however, 
if u exceeds (7, one may model the saturation element as a two-sided uncertain gain 
A in parallel with a nominal gain of one as shown in figure 9. A better approach is to 
model the saturation element by a gain with a nominal value of one and a one-sided 
negative uncertainty as shown in figure 10. Clearly, the one-sided model will produce 
less conservative margins than the two-sided model. 

A design example is presented below in which MSM algorithm is asked to check the 
robustness of a typical lateral/directional flight control systems with respect to the 
actuator uncertainty (e.g. position saturation) and the reduction in the effectiveness 
of all control surfaces. The state-space matrices are given in figure 11. The controller 
uses roll rate, P, yaw rate, R , and the lateral acceleration, 7V v , for feedback (see figure 
12). By putting “extender wires” on the uncertainty blocks A* and pulling them out 
into a separate “block”, one can check the system robustness. 

The plot of K mi j Km 7 aR d ^mox a ^e in figure 13. Note that the cr mag and Km 7 
which have the minimum values of .015 and .42 respectively are equal or less than 
K mi for all frequency, and therefore shown to be more conservative than one-sided 
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n 

B 

c 

D 


-3.4d-1 -4.8d-1 -2.8d*2 3.2d*l 0.0d*0 
5.2d-3 -3.0d*0 4.6d-1 -8.3d-6 0.0d*0 
1.5d-2 -1.4d-f -1.9d*0 -7.5d-7 0.0d*0 
0.0d+0 1.0d + 0 0.0d+0 0.0d*0 0.0d+0 

0.0d+0 0.0d+0 1.0d*0 0.0d*0 0.0d*0 

v 9.2d-4 -6.8d+8 

l.2d+l -7.2d-2 
3.8d-1 3.2d+§ 

O.Od+O O.Od+i 
O.Od+O 0.0d+8 

0.0d*0 5.7d + 1 O.Od + O 0.0d*0 0.0d*0 

O.Od+O 0.0d+0 5.7d+l O.Od+O 0.0d+0 
-6.3d-3 -3.6d-2 7.5d-1 5.1 d-8 0.0d*0 

O.Od+O 0.0d+8 
O.Od+O O.Od+O 
6.2d-2 1.M-I 


Figure 11: State-space matrices. 
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Figure 12: Lateral directional flight control with uncertainties at the controller output 
and plant input. 
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Figure 13: MSM at the plant input and controller output. 


structured stability margin 2£ mi . This results from the one-sided structured mul- 
tiplicative uncertainty that is not accounted for in the computation (i.e. nonlinear 
elements of actuators and gain reduction tolerance at the controller outputs). To 
properly account for the sign of the uncertainty and its structural information, the 
one-Bided MSM was computed and it is shown to have a better robustness measure. 
K mi has the minimum value of 0.677, indicating that the system can simultaneously 
tolerate at least a 67.7 percent reduction in the effectiveness of all control surfaces and 
the actuator inputs up to at least three times the saturation value without instability. 


4.2 Example 2 

The MSM’s minimal value K m k also can be used to quantify a control systems 
tolerance of simultaneous gain and phase variations at all the plant inputs and out- 
puts. This is done so the system has good stability robustness with respect to the 
uncertainties at the two actuator commands and the three sensor outputs (see figure 
14). These uncertainties come from various sources. Model accuracy deteriorates at 
higher frequencies due to unmodeled aeroservoelastic effect. Several potential error 
sources exist within the assumed perfect sensors. Model reduction of the actuators 
can also be considered as one of the effects at the plant inputs. 

Shown in figure 15 is the Bode plot of the MSM vs. frequency. The minimal 
value, denoted K m2 = .4196, gives an indication of the minimal size of structured 
perturbations required to destabilize the system or equivalently diagonal perturbation 
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Figure 15: MSM at the plant input and output. 
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as large as 41.96 percent can be tolerated at any frequency; at higher frequencies, 
perturbation magnitude as large as u;/10 can be tolerated. 

5 Conclusion 

Computer algorithms for determining the multivariable stability margin u K m ” have 
been developed. The algorithms provide a reliable tool for evaluating the robust- 
ness of control systems with significant gain and/or parameter uncertainties. The 
computation for the one-sided and two-sided structured stability margin were done 
using nondifferentiable optimization theory. Robustness analysis was performed on a 
typical lateral directional flight control system problem with large uncertainty. 
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Analysis for Real Parametric Uncertainty* 

Athanasios Sideris 
Dept, of Electrical Engineering 
California Institute of Technology 


Abstract 

This paper has a twofold purpose. First, to review some key results in 
the literature in the area of robustness analysis for linear feedback systems 
with structured model uncertainty, and secondly to present some new 
results. 

Model uncertainty is described as a combination of real uncertain 
parameters and norm bounded unmodeled dynamics. We will mainly focus 
on the case of parametric uncertainty. An elementary and unified 
derivation of the celebrated theorem of Kharitonov and the Edge Theorem 
will be presented. Next, an algorithmic approach for robustness analysis in 
the cases of multilinear and polynomic parametric uncertainty (i.e. the 
closed loop characteristic polynomial depends multilinearly and 
polynomially respectively on the parameters) is given. The latter cases are 
most important from practical considerations. 

Some novel modifications in this algorithm which result in a procedure 
of polynomial time behavior in the number of uncertain parameters will be 
outlined. Finally, we show how the more general problem of robustness 
analysis for combined parametric and dynamic (i.e. unmodeled dynamics) 
uncertainty can be reduced to the case of polynomic parametric 
uncertainty, and thus be solved by means of our algorithm. 


*To be presented at the 3rd Annual Conference on Aerospace 
Computational Control, Oxnard CA, August 1989. 
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Computational Issues in the Analysis of Adaptive Control Systems 

Robert L. Kosut 
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Abstract 

Adaptive systems under slow parameter adaptation can be analyzed by 
the method of averaging. This provides a means to assess stability (and 
instability) properties of most adaptive systems, either continuous-time or 
(more importantly for practice) discrete-time, as well as providing an 
estimate of the region of attraction. Although the method of averaging is 
conceptually straightforward, even simple examples are well beyond hand 
calculations. Specific software tools are proposed which can provide the 
basis for user-friendly environment to perform the necessary computations 
involved in the averaging analysis. 
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Eiperimental Experience with Flexible Structures 

Gary J. Balas 

California Institute of Technology 


Abstract 

This paper will focus on a flexible structure experiment developed at 
the California Institute of Technology. The main thrust of the experiment is 
to address the identification and robust control issues associated with large 
space structures by capturing their characteristics in the laboratory. The 
design, modeling, identification and control objectives will be discussed 
within this paper. Also, the subject of uncertainty in structural plant 
models and the frequency shaping of performance objectives will be 
expounded upon. Theoretical and experimental results of control laws 
designed using the identified model and uncertainty descriptions will be 
presented. 
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A DISTURBANCE BASED CONTROL/STRUCTURE DESIGN ALGORITHM 

Mark D. McLaren 1 
Gary L. Slater 2 

Department of Aerospace Engineering and Engineering Mechanics 
ML # 70, University of Cincinnati, Cincinnati, OH 45221 

1 Introduction 

In the past, the structure and its control system have been designed independently. Structural design and 
optimization, and control system design and optimization, have each been areas of separate research, each 
progressing vigorously along its own path. However, spurred on by recent proposals of new, large, highly 
constrained space structures, the question has arisen as to whether an integrated structural/control design 
procedure might not be more appropriate. The first papers actively investigating the question of simultaneous 
structure and control design began appearing in the literature around 1983 [1,2,3]. Since then, there has been 
a growing interest in this subject from other authors, although the field itself is still in relative infancy. Using 
a conventional design approach for a controlled structure, one would first optimize the structure alone, then 
design a control system for this baseline structure. This process may then be iterated until both the structure 
and control system meet necessary constraints and objectives. 

Some authors ([4, 5, 6, 7], for example) take a “classical” approach to the simultaneous structure/control 
optimization by attempting to simultaneously minimize the weighted sum of the total mass and a quadratic 
form, subject to all of the structural and control constraints. In this paper, the optimization will be based 
on the dynamic response of a structure to an external unknown stochastic disturbance environment [8]. Such 
a “response to excitation approach” is common to both the structural and control design phases, and hence 
represents a more natural control/structure optimization strategy than relying on artificial and vague control 
penalties. The design objective is to find the structure and controller of minimum mass such that all the 
prescribed constraints are satisfied. 

Two alternative solution algorithms will be presented which have been applied to this problem. Each 
algorithm handles the optimization strategy and the imposition of the nonlinear constraints in a different 
manner. Two controller methodologies, and their effect on the solution algorithm, will be considered. These 
are full state feedback and direct output feedback, although the problem formulation is not restricted solely to 
these forms of controller. In fact, although full state feedback is a popular choice among researchers in this field 
(for resons that will become apparent), its practical application is severely limited. The controller/structure 
interaction is inserted by the imposition of appropriate closed-loop constraints, such as closed-loop output 
response and control effort constraints. Numerical results will be obtained for a representative flexible structure 
model to illustrate the effectiveness of the solution algorithms. 

2 General Problem Formulation 

The integrated control/structure design optimization problem can be stated as follows; find the vector of 
structural and controller parameters that minimizes the mass of the structure subject to a set of prescribed 
stochastic disturbances, with limitations on the available control energy and on a set of allowable output 
responses. This can be written in the form of a nonlinear mathematical programming problem as 
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Minimize, with respect to p, the weight J(p), subject to 
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is an A^-vector of design variables, 

is an m - vector of structural constraints, 

is an n - vector of state variables, 

is an n u -vector of control forces, 

is an Tiy, -vector of stochastic disturbances, 

is the (n x n) matrix containing the system dynamics, 

is the (n x n u ) matrix containing information on the 

locations and orientations of the actuators, 

is the (n x n^) matrix containing information on the points 

of application and orientation of the disturbances, 

is the t ih control effort constraint cost function, 

is the n Ut -order partition of u representing the control 

forces involved in g cc% , 

is the maximum allowable value of the i th expected control 
effort function 

is an (n U| x n Ut ) control force weighting matrix, 

is the i ih output response constraint cost function, 

is the maximum allowable value of the i th expected output 

response function E[yJ W{y d ], 

is an (n<*, x n^,) output response weighting matrix, 

is the i ih design output n^-vector, 

is an (na, x n) matrix giving the relationship between 

the state variables and y dt , 

is an N - vector of minimum design variable values, and 
is an N - vector of maximum design variable values. 


(i) 


The side constraints are the strict bounds p t and p u on the design variables, and are vector inequalities 
that are imposed element by element. These design variable bounds are not included explicitly as constraints 
in the problem formulation. Note that the structural weight and the structural constraints g will in general 
not be functions of the controller design variables unless the controller mass is included in the design. Note 
also that g CCt is a weighted mean square control effort, and g OCt is a weighted mean square output response. 
Multiple output response constraints are allowed, although only one of these will in general be active at the 
optimum design. However, all of the control effort constraints will generally be active at the optimum design. 

In this work w is a zero mean Gaussian white noise disturbance with covariance X w . The structure will 
respond to this disturbance with some transient behaviour, in addition to a steady-state response. It seems 
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reasonable to optimize the structure for the steady-state response to the disturbance rather than the transient 
response because the transient behaviour will normally be of secondary importance to the response objectives 
(such as long term pointing accuracy). Also, for steady state optimization, the differential equation constraint 
(state equation) can be replaced with a steady state covariance equation, so that the control effort and output 
response constraints may be recast in terms of this covariance. Therefore the two-point boundary value problem 
is eliminated and the numerical solution of the problem is significantly simplified. 

2.1 Full State Feedback Control 

The simplest form of feedback control is to feedback the entire state vector, with u = — Jyx, where K is the 
(n u x n) state feedback gain matrix. The controller design variables for this case will be the n u n elements 
of K. Substituting this control into the state equation, and assuming that the disturbance w is zero mean 
Gaussian white noise, the state covariance matrix X for this case can be found from the Lyapunov equation 

F c iX + XFj x + G w X w Gl = 0 (2) 

where F c i = (F — GK) is the stable closed-loop dynamical matrix for the full state feedback case, X = E[xx T ] 
is the (n x n) symmetric state covariance matrix, and X w = E[ti?iu r ] is the (n^ x n w ) symmetric covariance 
matrix for the stochastic disturbances. 

Expressions for the controller constraints in terms of this covariance matrix can then be obtained as 
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for i = 1, . . . , np 
for i = 1, . . . ,n a 


( 3 ) 

( 4 ) 


where Ki is the (n U| x n) partition of K corresponding to Uj. It is assumed that the u, are independent, and 
that u and K are ordered as 


U T = [ «r ••• <], K T = [ Kf Kj ... KZ,) (5) 

Note that n «» = and that columns of G can be interchanged to force condition (5) to be satisfied. 
Using full state feedback, the first-order necessary (Kuhn- Tucker) conditions for optimality can be analytically 
solved to give [8] 


K = R~ l G T A*, where F T A* + A X F - K x GR~ l G T A* + W = 0 (6) 

and where R and W are respectively the (n u x n u ) and (n x n) matrices defined as 

« = diag{(^)^}, W = HjWiH*. (7) 

The variables A U( and X yt come from the Kuhn-Tucker conditions, and are the Lagrange multipliers associated 
with the i th control effort and output response constraints respectively. 

Equations (6) define the solution to the optimal control problem 

f°0 

min J c = / [x T Wx + u T Ru] dt (8) 

Jo 

where K is the optimal steady-state gain matrix, and A* is the steady-state solution to the associated Riccati 
equation. Although this LQR property only holds true at the optimum point, it is computationally convenient 
to assume that at every point in the design cycle, the control design variables will be found as the solution to the 
optimal control problem (8). Therefore, the numerical optimization problem can be reduced to optimization 
over just the structural design variables, along with an optimal control problem solution which will be a function 
of the Lagrange multiplier vectors A u and A y . The immediate benefit of this is a reduced dimensionality 
nonlinear programming problem. In addition, since the regulator solutions always give a stable closed-loop 
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system, no explicit check must be performed on the system stability during the solution procedure. The LQR 
assumption in this proble formulation is similar to the approach taken in [9,10], and others, where R and W 
are fixed, and not chosen to satisfy the constraints. 

2.2 Direct Output Feedback Control 

For most real systems, the state vector will be very large, and the use of full state feedback would result in a 
controller of unacceptably high dimension, assuming additionally that the entire state is available. However, 
usually only a small subset of the system states will be available to the designer, in the form of the output 
measurement vector. These can include actual system states along with linear combinations of the system 
states, in the form y — Hx , where y is the n^-vector of outputs and H is the (n y x n) output matrix giving the 
relationship between the outputs and the system states. If these output states are to be used in the feedback 
loop, the resulting control is termed direct output feedback, with the control forces defined to be u = — Ky, 
where K is the (n u x n y ) output feedback gain matrix. In this case, the controller design variables will be the 
rt u n y elements of K . 

Substituting this control into the state equation, the state covariance matrix X for this case can be found 
as the solution to the same Lyapunov equation (2), except that now the closed-loop dynamics are given by 
F c i — (F — GKH). Note that since K does not satisfy any special conditions (such as the LQR conditions), 
the closed-loop system F c i is not guaranteed to be stable for any K. If F c i is unstable at any stage in the 
solution procedure, the covariance matrix X cannot be found from equation (2). Therefore, for this case, hard 
constraints on the closed-loop system eigenvalues must be imposed at every step in the design proceedure. 
“Hard” in this sense means that special precautions must be taken in the solution procedure such that these 
constraints can never be violated. 

Expressions for the controller constraints for direct output feedback in terms of the covariance matrix can 
then be found to be 
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3 Solution Algorithms 

In the general problem formulation presented in the previous section, the constraint functions are generally 
highly nonlinear implicit functions of the design variables. Solution of this problem could be attempted by the 
direct application of nonlinear programming techniques; that is, using the exact functional expressions for the 
constraints. However, this approach quickly becomes computationally very expensive as the dimensionality 
increases since the full objective and constraint functions must be evaluated at every step, and their respective 
gradients at most, if not all, steps thoughout the design procedure. Such evaluations tend to be computationally 
very expensive. 

Approximation techniques, where the implicit nonlinear problem is replaced by a sequence of explicit 
approximate (although not necessarily linear) problems, have been shown to yield efficient and powerful al- 
gorithms for structural design optimization (see, for example, [11,12]). In this paper, two solution techniques 
based on approximation techniques will be tested on the integrated control/structure design optimization prob- 
lem. The methods will be compared with respect to the ease of use, generality of application, and numerical 
robustness to changes in move-limits and other solution parameters. 


3.1 Sequential Nonlinear Approximations 

In this method, the fully constrained nonlinear optimization problem is solved by the iterative construction 
and numerical solution of a sequence of explicit approximate problems. The approximate problems are first- 
order Taylor’s series expansions (with respect to either the inverse design variables ([13], for example), or with 
respect to hybrid design variables [14]) of the objective and constraint functions. Depending on the intermediate 
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variables chosen, the approximate functions may still be nonlinear functions of the design variables. Therefore, 
the numerical solution is accomplished using a mathematical programming code, specifically the modified 
method of feasible directions as implemented in ADS [15]. 

The solution process begins with some initial structure, which is analyzed using the finite element technique. 
At this point, the gradients of the active constraint set are evaluated, and the approximate problem is formed, 
with respect to the current design. Expressions for the gradients of all constraints considered can be evaluated 
analytically. The approximate problem is solved with ADS using an active constraint set strategy to reduce 
the dimensionality of the approximate problem by deleting the inactive constraints. Move-limits on the design 
variables are imposed during the solution to ensure that the design remains within the region for which the 
approximation functions are of acceptable quality. The choice of move-limits and how they change can have a 
significant effect on convergence, and will often be determined from numerical experience with the particular 
problem at hand. 

After the solution of the approximate problem, the structure and its control system are deemed optimal 
if a convergence test on either the absolute or relative objective function change over a specified number of 
successive global iterations is satisfied. Otherwise, the objective and active constraint gradients are evaluated 
for the new design, a new approximate problem formed, and the process above is repeated in an iterative 
manner. The solution procedure ends when the design variables converge, or when the number of iterations 
exceeds some preset maximum. 

Scaling the structure and controller to the closest constraint surface may be possible in some cases, because 
of the special assumed form of the controller. Scaling to structural constraints has been performed in other 
work (see [16]) and will not be covered here. If full state feedback is used, it is possible and practical to scale 
the structure to the closest control effort constraint and closest output response constraint simultaneously. The 
variables with which the structure is scaled are the structural design variables (elemental areas or thicknesses), 
and the Lagrange multipliers associated with the two controller constraints A w and A y (where for clarity, and 
without loss of generality, the subscripts on the A*s that refer to the particular control effort or output response 
constraints under consideration have been dropped). 

Note that changing the values of A u and A y cannot independently change the values of u m9 = tr(K T RKX) 
and y ms = tr(H J WHdX), because in the LQR problem, only the ratio of A u to A y is important. One can 
choose the ratio (A u /A y ) to satisfy one of the control constraints — say u ma . Then y ms will not in general be 
satisfied. Suppose y ms is too large (i.e. y ms > a 2 ) at the particular point where u ms is satisfied. Then the only 
way one can satisfy the y mf constraint is to increase the sizes of at least some of the structural members. This 
seems reasonable because if the control constraints could be satisfied by simply choosing appropriate controller 
parameters, then there would be no interaction between structural optimization and controller optimization. 
Intuitively, it can be seen that this is not the case. Note that each member of the structure will be scaled 
by the same amount to fulfill our goals. Obviously, this method is not absolutely mandated, and some other 
approach could be used where the design variables are not scaled equally. However, this would then be resizing 
rather than scaling , a process normally left to the nonlinear programming algorithm. 

The final scaling aim is to set u ms = /? 2 and y mt = a 2 . To perform the scaling, it is assumed that, at 
iteration t, the values (u m ,)j and (y ma ), will change, as a result of changes to (A u ) t and the (p ; );, according 
to the equations 


{ u m$ )i+l 
( w mi )i 


= A a * 6 b ' , 

»+i i+i* 


where a*, 6*, c* and d, are constants, and where 
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If initial (educated) guesses for these constants can be made, they can be updated in an adaptive manner 
during the scaling proceedure. 

Move-limits are imposed on the design variables during each approximate problem solution. This is done 
in an attempt to restrain the design variables to a region in which the explicit function approximations remain 
reasonably accurate. However, deciding how to impose these move-limits is a non-trivial task. The local curva- 
ture of the design space (i.e. how nonlinear are the actual constraint surfaces in the region about the expansion 
point of the approximations) will determine the move-limits, with more strict move-limits applied in regions 
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of high curvature, and less strict move-limits imposed in regions of low curvature. Since second-derivative 
information is required to estimate curvatures, and since such evaluations are very expensive computationally, 
imposing move-limits is usually reduced to an art based on past experience. Quasi-Newton methods obtain the 
second derivatives using only first derivative information, however, these methods typically take N iterations 
to fill the Hessian, and can be very costly if N is large. 

For the purpose of this work, a move-limits factor 7 is imposed in an exponential form. If the current design 
variable and approximation expansion vector is p, then the upper and lower bounds on the design variables 
for the current approximate problem are defined as 


P„ = TP- Pi = “P ( l3 ) 

where 7 > 1. The limits specified in equation (13) must be imposed element by element. Note that since 
the design variables in this example will be structural design variables only, they are restricted to be positive. 
Obviously, equation (13) must be modified if the design variables can be negative. The exponential form of 
the move-limit factor is defined by the particular choice of 7 mifl and 7 mar (typically 1.2 and 1000 respectively 
in this work). 

3.2 Continuation and Sequential Linear Programming 

The complex nature of the constraint functions in the nonlinear optimization problem, especially the con- 
troller constraints, leads to various convergence problems in the context of a classical gradient based nonlinear 
programming code such as ADS. As the problem dimensionality increases, convergence will usualy become 
increasingly difficult to accomplish, as step sizes reduce to satisfy the local linearity assumptions inherent in 
gradient based solution techniques. Another method for the solution of mathematical programming problems 
that has recently become popular is the use of continuation methods to impose nonlinear constraints cou- 
pled with sequential linear programming (SLP) [17,18]. The continuation procedure is a conceptually simple 
method of applying restrictive constraints gradually from less restrictive ones, which replaces the most de- 
manding constraint functions of the form g(p) < 0, by a set of neighbouring constraint functions Q i} defined 

by 


^i(Pi.Ti) = fl(P.) - (! - T.)i/(Po) < 0 for* = 0,...,M (14) 

where p 0 is the arbitrarily chosen initial design point, and jj is a continuation parameter satisfying 

0 = 7o<7i <"-<7„ = l (15) 

Note that for 70 = 0, when p = p 0 , the new constraint function Q 0 is identically satisfied. If convergence 
is acheived for 7 = 1, then the original constraints will be recovered in M steps. The step size A7 = 7, — 7*— 1 
(and hence M), can be chosen small enough so that assumptions on local linearity can be almost arbitrarily 
satisfied. 

Linear Programming (LP) methods are a powerful approach to handling a large number of locally linear 
constraints, and due to the wide availability of very efficient LP codes, are an attractive alternative to nonlinear 
programming methods. The neighbouring problems generated by the continuation procedure can be written 
in the form 


Minimize J(pJ, subject to C$(Pi,7i) < 0 (16) 

To transform these equations into a linear programming problem, the equations are linearized about the current 
point Pi, and move-limits on the maximum parameter changes allowable locally are imposed. Expanding the 
objective and constraint equations in (16) to first order in a Taylor’s series expansion about p { gives the locally 
linearized problems in linear programming form as 
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Minimize, with respect to A p if AJj = 



A p t , subject to 


dp 




A Pi + g(Pi) - (i - 7.)g(p 0 ) < o 


(17) 


— c < A p t < c 


for * — 0,. . . ,(M - 1). All elements of e are assumed positive, and the vector inequality is imposed element 
by element. 

The algorithm begins with an initial structure, which is analyzed using the finite element technique. The 
initial problem (70 = 0) is solved, and the continuation parameter 7 is incremented from 7 0 = 0 by Ay to 
71 . Note that if initially K = 0, so that the control effort allowed for the initial local problem is zero, this 
initial problem becomes a pure structural optimization subject to the dynamic output repsonse constraints. 
The increment A7 is set by an a priori choice of Af, the number of continuation steps, although A7 need not 
be constant throughout the solution procedure. Successful implementation of the continuation method has 
been reported when A7 was chosen as initially quite small and increased to a larger value during the solution 
[18]. The choice of A7 is closely coupled with the choice of the nominal design variable move limit vector c 0 - 
There is in fact a tradeoff between the satisfaction of the local linearity assumption through c, and the ability 
to converge to the neighbouring problem through A7. Usually, for each particular problem, some numerical 
trial and error will be required to find those values of e 0 and A7 that yield an efficient solution technique. 

The gradients of the objective and constraint functions are calculated at the current design, and then the 
associated linear programming problem (17) is solved by a linear programming code. In this work, the linear 
programming routine E04MBF from NAGLIB (National Algorithms Group LIBrary) was used, although other 
routines inserted at this point should provide the same solution. Since the local linearity assumption will 
never be exactly satisfied, the actual constraint values at the new point, specified by the solution to the linear 
programming problem, will be different than that predicted. Therefore, the constraints Q { may not be satisfied 
following the linear programming solution step. 

Since a converged subproblem solution is required before increasing the continuation parameter, this local 
problem is iterated locally until convergence is obtained. At each local iteration, new gradients are calculated, 
and the move limits on the design variables are reduced so that c — cc 0 , where a value of c = 0.75 was used 
in this work. If convergence to the local problem does not occur within 15 iterations (where move limits are 
about 1% of their nominal values), the move limits are increased to their nominal values e 0} and the local 
iterations are repeated. Numerical experience with this algorithm has shown that this procedure is flexible 
enough numerically so that converged subproblem solutions can be obtained in a reasonable number of local 
iterations, as long as the neighbouring problems are “close enough”. Practically, this means that either the 
constraint values of the initial system should be “close” to their final desired values, or that M should be 
large. Once the local problem has been solved, the continuation parameter 7 is incremented, and the new local 
problem solved as before. At the M xh continuation step, 7=1 and the original problem is recovered, so that 
the solution to the M ih local problem is the solution to the original problem. 

If closed- loop stability constraints are violated at any stage in the solution procedure, these must be 
imposed immediately. To achieve this, it is possible to employ a method that never requires the calculation of 
the closed-loop eigenvalue derivatives, saving considerable computational expense. Of course, if in addition to 
overall stability there are constraints on closed-loop damping ratios or bandwidth, then the evaluation of the 
closed-loop eigenvalue derivative may be necessary at some point. The method used in this paper is to simply 
bisect A p i and perform another analysis until a stable system configuration is obtained. 


4 Gradient Analysis 

For the numerical optimization procedure to be practical, especially as the dimensionality increases to realistic 
structures, it is essential that it be possible to evaluate the first-order sensitivities of the complex constraint 
functions in an efficient manner. The objective function (the weight) is a linear function of the finite element 
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thicknesses and/or cross sectional areas (for truss type finite elements), so that its gradient is easy to calculate 
at any point in the design space. 

4.1 Gradients for the case of Full State Feedback Control 

The gradients of the controller constraints with respect to a structural design variable pj are given by 


dp 
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dpj 
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where Vi , <2,, and Hj are evaluated using the following set of equations: 
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Note that gradients with respect to controller design variables need not be evaluated since these design variables 
were effectively removed from the optimization problem by the LQR constraint. 


4.2 Gradients for the case of Direct Output Feedback Control 

The gradients of the controller constraints with respect to a structural design variable pj are given by 


^ = ^ttU^KTRiKiH + ^KTRiK—^X + y^j 

dpj Pi LV dpj °PjJ 
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where y,, Z it and Mj are evaluated using the following set of equations: 


(27) 

(28) 
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The gradients of the controller constraints with respect to the elements of the gain matrix K can be written 
in matrix expression form as 


dg C c t _ 
dK -ft 


[{RiKH -G T yi) XH t 


djloc, 

dK 


[G T ZiXH T ] 


[ d 8 1 ds 
— = — — and where 

dAi^ daij 

Hi = diag {0 • ■ 0 Ri 0 • • • 0 } (n , xn .) 

Note that the Ri in equation (35) is of order (n Uj x n Ui ), and is in the i th diagonal block of Ri. 


(33) 

(34) 


(35) 


5 Example: The DRAPER I Teterahedral Truss Structure 

The DRAPER I structure [19] is a tetrahedral truss attached to the ground by three right-angled bipods, as 
shown in Figure 1. Although attached to the ground, this model will act as a typical flexible structure pointing 
subsystem (e.g. antenna, radar, optical) attached to a rigid core. Any motion would then be with respect to 
this rigid core, and transmit forces to it. Consequently, this model has no rigid body degrees of freedom. The 
finite element model has 12 truss elements, since the joints are pinned and transmit no moments. There are 
four nodes that are free to move in all directions, so the model contains 12 degrees of freedom. The structural 
design variables are the cross-sectional areas of each of the 12 truss elements. Since there are 12 degrees of 
freedom in the model for this structure, the state-space model will be 24 th order. There will be 6 inputs 
corresponding to the 6 legs of the structure, and a varying number of outputs, depending on the problem at 
hand. 

For the purposes of this work, material parameters of p = 0.1 lb/in and E = Young’s Modulus — 20 kpsi 
were used. The dimensional values E and p were chosen to give initial numerical values of structural frequencies 
for the dimensional model roughly comparable to those of the non-dimensional model. The model contains 
no nonstructural mass. Elements 7 through 12, the three right-angled bipods, take on the duties of force 
actuators (and possibly colocated velocity and/or displacement sensors). Only one output response constraint 
is defined (n a = 1), with the design output vector y d representing the line-of-sight error of the top vertex [(ar , y ) 
displacements of vertex 1]. The disturbances, labelled uq and W 2 in Figure 2, are assumed to be independent, 
zero mean, Gaussian disturbances with intensity 1.0. 

The damping added to the state space system will depend on the state space realization used. For cases 
Where a realization based on physical variables is used, the damping matrix C is formed to be C = 0.1M 4- 
0.001R,, For cases where a realization based on modal variables was used, the damping ratio of each mode 
was specified to be 0.1% of the modal frequencies during the formation of the state matrices. The weighting 
matrices R and W are set to the identity matrices, so that equal weighting is given to all components of u and 
y d . The minimum cross-sectional areas for all elements was specified as 0.1 in 2 . For this problem, no static 
structural constraints were specified in this model of the DRAPER I structure, the intent being to investigate 
the effect of the closed-loop controller constraints on the structural design optimization. 


5.1 Full State Feedback Control 

In this section, the sequential nonlinear approximations solution algorithm, with the addition of the scaling 
procedure outlined in Section 3.1, is applied to the full state feedback control of the DRAPER I structure. 
The effect of the scaling procedure used here can be determined by applying the sequential approximations 
algorithm in the form of a direct output feedback problem with H = / with no scaling assumed. The 
continuation solution algorithm is also best handled in the form of a direct output feedback problem since 
no special scaling is assumed. Both solution algorithms applied to the case of full state feedback are discussed 
in Section 5.2, where direct output feedback control is considered. For brevity, only limited results for both 
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controller methodologies are presented in this paper, but a full discussion of this example can be found in 
reference [20]. 

Runs were made optimizing the DRAPER I structure using an inverse design variable approximation for 
all constraint functions. The initial structure was defined with all structural design variables set at 10 in , and 
with the Lagrange multipliers A u and Ay set at 1.0. This set of initial conditions will be termed the symmetric 
set of initial conditions, for they specify a structure with a number of vibrational modes of the same frequency 
(repeated eigenvalues). A range of allowable expected output response (a 2 ) of 1 x 10 -5 in 2 to 1 x 10 -4 in in 
steps of 1 x 10 -4 in 2 , and allowable expected control effort (/? 2 ) of 50 lb 2 to 80 lb in steps of 10 lb were used. 

Table 1 summarizes the resulting minimum weight in pounds found for the case where a state-space real- 
ization based on the modal displacements and velocities was used. Intuitively, two trends would be expected 
in the data displayed in Table 1. The optimum weight should decrease as the allowable control effort 0 2 is 
increased at constant allowable output response a 2 (left to right across the table), and the optimum weight 
should decrease as the allowable output response a 2 is increased at constant allowable control effort /? 2 (down 
the table). With reference to Table 1, we can see that this trend is observed in a macroscopic sense only, there 
being several examples where this trend is not observed. For example, considering the first column of Table 1, 
which corresponds to (3 2 = 50 lb 2 for varying a 2 , we see only two exceptions to the expected trends, these 
being at a 2 values of 6 x 10 -5 and 9 x 10~ 5 . Similar results are observed in all other columns and rows of 
Table 1. Results obtained using a state-space realization based on the physical displacements and velocities of 
nodal points are more consistent than when using the modal variables, although still not totally uniform. It 
might be pointed out that the results when using a physical realization were consistently easier to obtain, there 
being no need to alter the nominal value of j min to obtain convergence, and the number of global iterations 
required for convergence being consistenty lower. 

Some understanding of these contradictory results can be found by considering Table 2, which gives the 
optimal element areas found for (3 2 = 50 and for the varying a 2 corresponding to the first column of Table 1. 
Also given in this table is the number of global iterations required for convergence, the final values of the 
Lagrange multipliers (which then defines the LQR controller), and the initial value of the structural design 
variables (all the same for the symmetric set of initial conditions) at which the initial scaled system satisfies the 
constraints. Immediately apparent from Table 2 is a number of seemingly separate regions of the design space 
into which this structure has converged. For example, the final designs for a 2 = 5 x 10 5 and a 2 = 7x10 
seem to be similar in relative structure. Here, “similar” refers to the relative sizing of the structural members, 
in that design variables that are “larger” in one design are “larger” in the other. Both these designs are 
however distinctly different from those for a 2 = 1 x 10 * and or 2 = 3 x 10 which themselves are similar. 
The conclusion seems to be that we are converging into different regions of the design space with our solution 
algorithm, and that there are numerous local minima. Several columns of Table 2 seem to define their own 
region of the design space, being dissimilar to any other column. In other words, our design space seems to 
have multidimensional corrugations leading to multiple local minima. The solutions will lie somewhere on the 
intersection hyperplane between the surface of constant allowable output response and the surface of constant 
allowable control effort. 

This corrugated nature of the design space can be illustrated by considering the solutions obtained, for the 
same constraint case, when starting from different initial conditions. For the case of d 2 = 75 and a 2 = 1 x 10 , 

Table 3 summarizes the results of runs made when modal state space realizations were used, and when only the 
initial conditions are varied. The different initial conditions are defined by setting all structural elements equal 
except the first (element 1), to which is added a percentage of the size of other elements. Even with this limited 
variation in the initial conditions, there are seemingly many distinct regions in the design space into which the 
system may converge. A picture of the constraint surfaces as a one-dimensional slice of the multidimensional 
space will emerge if these optimal structures are varied into each other in a linear fashion, and the constraint 
values are calculated between each case. That is, the structural design variables and Lagrange multipliers are 
changed linearly from the optimal values in one case to those in another case. Then the constraint surfaces 
obtained would be those seen when travelling in a straight line between each successive point. 

The results of such an analysis are shown in Figures 3 for the cases corresponding to those given in Table 3. 
As expected, the weight varies linearly between the cases, but it is the constraint curves that are much more 
revealing. For example, considering Figure 3, one can see that between case 1 and case 2, there is a ridge 
of output response larger than the maximum allowable value. Similarly, the control effort first decreases, then 
also increases to a ridge of high value. This corresponds to a hump in the constraint surfaces between the two 
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points in the design space. Assuming that we would see such behaviour when moving in every direction away 
from case 1 and case 2, rather than just in a direction between the two as shown in Figure 3, then the design 
points corresponding to these cases would represent local minima. In this situation, the design can become 
“trapped” in such a locally convex region, causing the solution algorithm to converge to different points. 

With reference to the same Figure 3, one can see that both the output response and control efforts are 
virtually constant between cases 2 and 3, while the weight increases slightly from 2053.0 lb to 2090.6 lb. 
This indicates that case 2 and case 3 actually represent the same optimal solution, with the difference being 
accounted for in the variance allowed by the convergence criteria used. The direction in the design space 
represented by the movement from case 2 to case 3 would lie in the intersection hyperplane of the surfaces of 
constant control effort and output response constraints, and would be at a shallow angle to the linear surface 
of constant weight. Figure 3 graphically illustrates a design space that is a very complicated function of the 
design variables, in which multiple local minima abound. 

There are some other tests that can be made on the hypothesis that the design is becoming trapped in 
local minima. If the design is actually trapped in a local minimum, the solution should stay in the vicinity of 
that minimum if the problem is changed only slightly. That is, if a converged solution is used as the initial 
conditions for an optimization run where the constraint objectives are changed by a “small” amount, then the 
new problem should converge to a point that is “close to” the initial point. Table 4 represents such a situation. 
Here, the solution was first obtained for the case where (3 2 — 50 and oP = 1 x 10 ^ , and where a modal state 
space realization and inverse design variable approximations were used. This converged solution was then used 
as the initial conditions for the cases (3 2 = 50 and a 2 = 2x 10 -5 , and (3 2 = 60 and a 2 =lx 10~ 5 . Moving 
down each column, and across the top row, of Table 4, the converged solution from the previous case was used 
as the initial condition for the new problem. As can be seen from Table 4, the two expected trends in the data, 
as mentioned previously, are now observed without exception. The optimal solutions for the first column of 
Table 4, corresponding to the cases where f3 2 = 50, are given in Table 5 The solutions now appear to be in 
the same local region of the design space, as evidenced by the relative sizing of the optimal structures. For 
example, note that in all converged designs, structural elements 9, 10, and 12 are at their lower gage limit of 
0.1 in 2 , and that the first structural element is the largest by far. The optimal solutions for the (3 2 = 60 cases 
from Table 4 also appear to be in this same region of the design space. These results test the hypothesis that 
designs are converging to local minima, and indicate that the local optima are real and not simply figments of 
a numerical imagination. 


5.2 Direct Output Feedback Control 

Recall that when using direct output feedback, no simplifying assumptions can be made regarding the controller 
design variables (elements of K), such as the LQR assumption used in the case of full state feedback. Therefore, 
no scaling of the structure and controller to the closest constraint surface is performed. If a full state feedback 
case is to be solved in the form of direct output feedback with H = 7, the number of design variables will 
increase significantly over the number of design variables created when other types of controllers are considered. 
This is because the number of states in the plant model will generally be large for anything but trivial systems, 
hence K will have many elements, all of which will be treated explicitly as design variables. However, such a 
situation is considered here to aid a comparison between the two solution algorithms, and the results obtained 
in the previous section. For the continuation algorithm, convergence was obtained after the specified number 
of global iterations, set by the specification of A7. Also note that, since only one each of the output response 
and control effort constraints are specified at this stage (n a = 1, np = 1), these can both be set as equality 
constraints without loss of generality since they will both be active at an optimum. 

Table 6 gives the optimal weights found using the sequential approximations solution algorithm in the case 
when a physical state space realization and inverse design variable approximations are used, for /? 2 = 50 and 

= 60, and for varying a 2 . Compared to the similar case when scaling was performed, the optimal weights 
found here are larger in every case. Additionally, the solution times were significantly larger because of the 
number of iterations required for convergence. Note that some values in Table 6 are for situations where an 
average steady state value was obtained, but where the design was jumping around too much over each global 
iteration for convergence to occur. This was so even though the move-limits for every case in Table 6 were 
set at a relatively small ±2.5%. A smaller move-limit would aid convergence, but slow it considerably. Also, 
smaller move-limits may cause premature convergence if the design is at a point where the constraint surfaces 
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and the surface of constant weight are nearly parallel. 

Table 7 gives the optimal weights found using the continuation algorithm for the same cases listed in Table 6 . 
All cases listed were obtained with the continuation parameter A 7 = 0.01 until 7 = 0.5, when A 7 became 0.02, 
so that for every case convergence was achieved in 75 global iterations. Also listed in Table 7 are the move- 
limits on the maximum parameter changes allowable locally (c in Section 3.2). These are set to give a tradeoff 
between the satisfaction of the local linearity assumption and convergence to the local neigbouring problem, 
and must be found by numerical experimentation. Note that in both situations represented by Tables 6 and 
7 , minimum move-limits on the elements of K are set so that these elements can change sign if desired. 

The solutions given in Tables 6 and 7 were obtained when the initial structure was defined with all truss 
elements of equal cross-sectional area. The particular values for this area are given in Tables 6 and 7 for 
each case, and were chosen so that the initial output response was “close to” its desired final value. With 
all structural elements at 120 in 2 , 90.0 in 2 , and 60.0 in 2 , the initial output responses were 2.067 x 10 , 

3.674 x 10 -5 , and 8.266 x 10 -5 respectively. The initial control effort was zero of course, since all elements of 
K were set initially to zero. 

Comparing the optimal weights from Tables 6 and 7, it can be seen that those obtained using the con- 
tinuation solution algorithm are significantly lower than those obtained using the sequential approximations 
solution algorithm in every case. The optimal weights are also much more consistent, in terms of the expected 
trends as a 2 increases, when using the continuation method. Convergence was obtained for every case listed 
in Table 7, whereas for three of the cases listed in Table 6 , no convergence was obtained within 300 iterations. 
The reason for the convergence failure can be illustrated by considering the convergence histories given in 
Figures 4 and 5, for typical cases from Tables 6 and 7 respectively. The inset in Figure 4 shows some of the 
histories toward the end of the solution in more detail. These can be seen to be very rough, as compared to 
the histories in Figure 5 which are smooth everywhere. These parameter oscillations for the sequential approx- 
imations solution technique are indicative of a move-limit set too high, so that the local linearity assumption 
is violated. However, since the convergence is very flat toward the end of the solutions, a smaller move-limit is 
likely to cause premature convergence, or to significantly slow down the convergence for very little additional 
objective reduction. 

The optimal weights found using the continuation method listed in Table 7 compare very favourably with 
those found for the same cases ( f3 2 = 50 and varying a 2 ) when the full state feedback LQR assumption was 
used to simplify solution. In almost every case, the optimal weight is approximately equal to or lower than 
those found earlier. However, the sequential approximations solution algorithm results are much worse, being 
significantly larger than the optimal weights found earlier in every case. The sequential approximations solution 
algorithm performs so poorly because it tends to become trapped in a local minimum close to the specified 
initial conditions. 

Consider now cases where the full state is not available for feedback. In the DRAPER I structure, the six 
right-angled bipods are usually assumed to take on the duties of force actuators and colocated rate sensors, so 
that H = G T (output state of dimension six). Tables 8 and 9 list the optimal weights found for this reduced- 
order output vector for the same cases used in Table 6 , when the sequential approximations and continuation 
solution algorithms respectively are used. Note that the optimal weights consistently obey the trends that are 
expected as the allowable output response and controller effort are altered. Even so, the designs can converge 
into completely different regions of the design space for any particular case. This is illustrated by Table 10, 
which gives the optimal structural design variables for the cases represented by the second column of Table 9. 
Note that every solution does not define a unique local minima. Many of the solutions seem to lie in the same 
local minima region, for example the solution for the case a 2 = 1,7, and 9xl0 -5 . Once again, the design space 
is found to have many local minima. 

The two solution algorithms here seem to predict quite similar optimal weights for the cases considered, 
although in general those found using the continuation method are slightly better. However, convergence for 
the results using the sequential approximations solution algorithm were more difficult to obtain than those for 
the continuation method. Altering the nominal move limits (set at ±2.5%), and perhaps reducing it toward 
the latter stages of each solution would aid convergence, but at increased effort on the part of the user. The 
continuation results are easier to obtain since they require less individualized attention. 

Comparing Tables 8 and 9 to Table 7, it can be seen that the optimal weights found when H = G are 
much larger than those found when full state feedback was used. The reason for this is the particular placement 
of the disturbance forces relative to the design output states (which determine the output response). For the 
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case considered here, the DRAPER I structure is disturbed at node 1, the displacement of which is to be 
kept below the design objective value a 2 . With full state feedback, the displacement states of node 1 are 
available for feedback, whereas if H — G T , these states are not available, and the effect of the displacement 
of node 1 is available only indirectly through its effect on the velocities along the six bipods that make up 
the sensors. The importance of these states can be seen by examining the optimal gain matrices for the full 
state feedback case. The largest gains associated with displacements and velocities in these matrices appear 
in the columns corresponding to node 1 degrees of freedom, indicating that the states associated with node 1 
are very important. When they are not available, the controller does not have as much information about the 
state of node 1, the node it is trying to control, as it does in the case of full state feedback, and will increase 
the structural stiffness (and hence mass) to compensate. 

To illustrate further, consider output feedback with the displacement and velocity states of node 1 added 
to the output vector used previously. The optimal weights for the same /? 2 = 50 and f3 2 = 60 cases used in 
Table 9 with this new output vector (of dimension 12 now) are given in Tables 11 and 12, when the sequential 
approximations and continuation solution algorithms are used respectively. There are now larger differences 
between the weights found using the two solution algorithms than in the case when H = G T only, with the 
continuation method giving the best results (with one exception). The weights obtained using the continuation 
algorithm are now very close to those obtained when using full state feedback (in Table 7), although still a 
little larger in every case. This is because even with this larger output vector, the displacement states for 
nodes two through four are still not used in the controller. 

All the results presented so far were generated with the external stochastic disturbance intensities set at 
one (X w = I). If these intensities are varied as X w = x w I } the effect of varying x w on the optimal weight is 
shown in Figure 6. These results were generated using the continuation solution algorithm, for the case when 
a 2 = 1 x 10~ 5 and ft 2 = 50, and when all structural design variables were initially set at 120 in 2 . As can be 
seen, the relationship between the disturbance intensity and the optimal structural weight appears basically 
linear in the range shown. However, a linear fit of the data does not produce an optimum weight of zero for 
a zero intensity disturbance, as would be expected. Therefore, the relationship cannot be exactly linear. The 
results also indicate that the disturbance level chosen for the previous results (i^ = 1) was significant for the 
range of output response and control effort constraint objectives used. 

As long as the neighbouring problems in the continuation algorithm were close enough, which was satisfied 
by starting from an initial point where the constraint values were “close to” their final desired locations, no 
difficulties were experienced in obtaining convergence. The solution times for the continuation algorithm were 
2-3 times longer than for solutions by the sequential approximations algorithm, since on average approximately 
8-12 local iterations were required for convergence to the neighbouring problem for each global iteration. The 
performance of the sequential approximations solution algorithm decreases as the dimensionality increases, as 
evidenced by the sequence of cases where H = G 7 (48 design variables), H = G 7 plus node 1 displacement 
and velocity states (84 design variables), and H = 1 (156 design variables). However, the continuation method 
seemed much less sensitive to the problem dimension. The continuation solution method was found to be 
generally superior, for this problem at least, to the sequential approximations solution method, with respect to 
the confidence in obtaining a “good” converged solution. The disparity in solution times was acceptable because 
of the ease with which solutions were obtained using the continuation method, and because the solutions found 
seemed to be generally much better than those found using the sequential approximations algorithm. 


6 Conclusions 

In this work, the integrated control /structure design optimization problem has been investigated from a re- 
sponse to disturbances point of view. Both full state and output feedback controllers were employed in the 
control strategy, and two solution methods were compared. It was found that for this problem, the continuation 
method coupled with a sequential linear programming approach performed better than the more traditional 
type of nonlinear approximations approach, in the sense that it was more robust to changes in the arbitrary 
parameters set by the user, obtained better results, and the results were easier to obtain. The design space was 
found to exhibit multiple local minima in which the solution could become trapped, although the continuation 
solution method seemed to handle the corrugated design space better than the other method. In future work, 
more diverse controller types must be considered, along with structures consisting of more complicated finite 
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elements than simple truss members. Additionally, more realistic problems of higher dimension must be solved, 
to demonstrate the practicality of this design procedure. 
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Figure 2: The disturbance model 


Figure 1: The DRAPER I Structure 
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Table 1: Optimal weight using a modal state-space realization, the 
symmetric set of initial conditions, and inverse design variable approx- 
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Using Deflation in the Pole Assignment Problem 
with Output Feedback 

George Miminis 

Department of Computer Science, Memorial University of Newfoundland 
St. John’s, Newfoundland, Canada AlC 5S7 

Abstract 

A direct algorithm is suggested for the computation of a linear output feedback for a multi 
input, multi output system such that the resultant closed-loop matrix has eigenvalues that include 
a specified set of eigenvalues. The algorithm uses deflation based on unitary similarity transfor- 
mations. Thus we hope the algorithm is numerically stable, however, this has not been proven as 
yet. 

1 Introduction 

Deflation is a technique that has been efficiently used in the solution of the standard eigenvalue 
problem of a matrix A as well as other eigenvalue related problems. According to this technique once 
an eigenpair (Ai,xi) of A is computed, we continue the process with a matrix that possesses only 
the remaining eigenvalues of A, and possibly a zero eigenvalue in the place of Ai. In this way we are 
left to solve a smaller problem. Deflation can be accomplished by a variety of algorithms. Some of 
them are, Hoteling’s deflation, Wielandt’s deflation, deflation based on similarity transformations, 
deflation by restriction, etc. An excellent review of the first three methods can be found in [7, pp. 
584-600], whereas deflation by restriction can be found in [5, p. 84]. Lately, deflation has been 
used in the solution of eigenassignment problems. For example, Wielandt’s deflation is used in [6] 
to solve the partial eigenvalue allocation problem with state feedback for continuous time systems. 
In this paper we will be concerned with deflation based on unitary similarity transformations, and 
how it can be used in the pole assignment problem with output feedback, or as it is often known, 
the eigenvalue allocation problem with output feedback (MEVAO). 

In section 2 we define the MEVAO. In section 3 we define transmission zeros and discuss how 
they affect the MEVAO. In section 4 we give an algorithm for the solution of the MEVAO. Finally 
in section 5 give some numerical examples to demonstrate the performance of our algorithm. 
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Lower case Greek letters represent scalars, upper case Roman represent matrices, while lower 
case Roman represent column vectors and indices. Superscripts T, H indicate transpose and con- 
jugate transpose respectively. The notation k = i(r)j means that k takes all the values starting 
from i until j with step r. e, represents the ith column of the identity matrix I, and A(-) the set of 
eigenvalues of a matrix, counting multiplicities. R(-), N(-) will represent the space spanned by the 
columns of a matrix, and the null space of a matrix respectively. JR,C will represent the set of real 
and complex numbers respectively. 

2 The Problem 

The MEVAO problem that we will attack in this paper may be defined as follows: 

We are given a controllable and observable system 

x(t) = Ax(t) + Bu(t) (1) 

1/(0 = Cx(t) (2) 

where A 6 JT Xn , B £ R nXm , C £ R pXn , rank (B)=m, rank(C)=p, x(t) £ R n is the state 
of the system at time t and u(t) the input at time t. We are also given a self conjugate set 
of eigenvalues A,-, t = l(l)r with r=min{n,m-(-p-l}, we need to compute K £ R mXp such that 
A(A - BKC) D {A, | i = l(l)r}. As a result, a linear output feedback u(t) = -Ky(t) may be 
computed that will make (1) become 

x(t) = (A - BKC)x(t). 

The resulting system will have desirable properties for the contror engineer. 

3 Transmission Zeros 

In the remaining of the paper unless otherwise stated, we will assume without loss of generality 
that m < p. 

Transmission zeros ( trzs ) play an important role in the MEVAO, thus it is vital that they are 
well understood. Our experience with the literature on trzs has been rather disappointing. In our 
search for a definition we have found several; some of them even contracting one another. Thus we 
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decided to resort to the physical motivation of trzs , and from that to derive a sensible definition. 
Some observations that may give some physical motivation to the concept of a transmission zero 
(of a controllable-observable system in our case) were found in [2], p.41. From this we concluded 
that, physically, a transmission zero ( trz ) of a system is a specific frequency of the system for which 
there exists a nonzero input that will yield a zero output. This basically is definition 1 below. In 
the following we will present three definitions of trzs , and we will prove their equivalence. The 
reason we present these three defintions is because the first is directly derived from the physical 
motivation of trzs , the second is very useful in matrix computations, and the third is frequently 
encountered in the literature. 

Definition 1: A number ( E C is a trz of (1),(2) if and only if there exists nonzero input u that 
can yield a zero output y = G'(C)^ = where 


G(s) = C(sl - A)~*B , with s£ C 


is the transfer function of (1),(2). □ 

The number of trzs of (1 ),(2) can be at most n — max{m,p} (see for example [2], p.65). The set of 
all trzs of (1),(2) will be denoted by Z tr . Often it is helpful to use the following definition of a trz. 
Definition 2: £ € C is a trz of (1),(2) if and only if 


3u ^ 0 : 


A - (I B 
C 0 


x 

u 


= 0 with x = (£/ — A) 1 Bu 


Frequently we encounter in the literature the following definition of trzs. 

I iiKL \ 

V'i(C) 


Definition 3: Let L-'iQGiQR-'iQ = 


V-m-l(C) 



v 0 7. 

be the Smith-McMillan (S-M) form of £?(£)> where L(£), #(C) are polynomial matrices and we have 
assumed that G(s) has normal rank m. Then ( is a trz of (1) if and only if there exists i = 1, • • • , m 
such that €i(C) = 0. O 

In the S-M form of G(() the following properties hold 
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«l(C) I *a(C) I — I ««(C) and ^(C)l^m-l(OI---IV’l(O- 


Because of the above properties we see that there cannot be trz £ that will also be a root of $ m (0* 
Since, if it were then € m (0 = 0 and ( would have been cancelled in e m (C) | $m(0- Hence ( would 
not be a trz , which is a contradiction. The roots of 0,(£), i = l(l)m are the poles of (1),(2), 
counting multiplicities. In our attempt to prove the equivalence of the above definitions we will 
be using the expression G(Qu with ( E Z tT . One then may wonder about the existance of G(Qu 
when C is also a pole of (1),(2). The following lemma will help us clarify this point. 

Lemma 1: Let £ E C be a trz as well as a pole of (1),(2), then there exists u / 0 such that 
lim G(s)u exists. 

S— C 

Proof: Let ( be a pole as well as a trz of (1), then from the S-M form of G(£) we have that 

3(iJ) with i < j : $t(0 = f j(0 = 0 • 

Let us now for illustration assume p = m = 3 and ^i(C) — £2(0 = 0* Then by taking u T = 
(0,77,0) / 0 we have the following from the S-M form of G(s). 

lim L~ 1 (s)G(s)R~ 1 (s)u — 
s— c 


/ lim 

V 

( o 
o 


lily, g ?(f) 


v 


01im / 

S — ► £ ^3 (^) / 


lim 


\ 

(o\ 


V 

/ 

[o J 


which is defined since $3(0 / 0* Since Z(s), R(s) are nonsingular (they have determinants inde- 
pendent of s, see for example [7], p.21), limi~ 1 (s), lim R~ l (s) are defined. 

s->C s-*>c 

Hence if we take u = lim R~ l (s)u then limG(s)u is defined for every £ E Zt T that is also a pole of 
s *->C 


(1),(2). 
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In the following when we write G(C)u with ( e Z tr also being a pole of (1), we will mean 

limG(s)u. Similarly when we write x = ((I - A)~ 1 Bu we will mean x = lim(s/ - A)~ 1 Bu, when 
®— *C s— ►( 

c € Ztr is also a pole of (1),(2). 

Theorem 1: Definitions 1,2 and 3 are equivalent. 

Proof: To prove that definition 1 is equivalent to definition 2 we proceed as follows: 


C € Ztr {3«^0 


{3 « ^ 0 
3 u ^ 0 


C(<;i - A)-'Bu = 0} 

Cx = 0 with x — (£/ — j4) _1 5u} 
A - a B \ ( X 
C 0 ) I u 


= 0 with x = (£/ - A) 1 Bu 


4=2 C € Z tT 


To prove the equivalence of definitions 1 and 3 we proceed as follows: 


C G Z tr 4=2 {3 « ^ 0 : G(C)u = 0} 

diag isKl 

3«/0 : 1(C) I \R(C)u = 0\ 


mo 


( 3 ) 


Since L(s), R(s) are nonsingular for Vs € C, M(Q must not have full column rank in (2). Hence 


( 3 ) 


{3 i G {1,2, • • • ,m} 




o(0 = o} 


□ 


In the last theorem we silently assumed that the normal rank q of G(s) is m, or equivalently 

1 A- si B 


the normal rank n + q of P(s) = 


0 


is n + m. However, theorem 1 holds even for 


the degenerate case where q < min{m,p} (= m according to our assumption). To show this we 
proceed as follows: 
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First we prove Z lT = C using definition 1. Since q < rn the following is true 


defi 


{V( 6 C 3u / 0 : G(C)u = 0} «=£ Z iT = C . 

To prove the equivalence of definitions 1 and 2 we see 


( 4 ) 


Z tT = C \ 


VC G c 3u/0 : 


A -(I B 
C 0 


x 

u 


= 0 with x = (C I — A) 1 Bu} 


To prove the equivalence of definitions 1 and 3 we see 


Ztr — C < 


VC G C 3u / 0 : 1(C) I 


i=l( 1 )m 
0 


R(()u = 0 


«=> {VC €C 3* e m} : e,-(C) = 0}. 


Hence another definition of trzs that is often found in literature, and according to which C G Zt r 
if and only if G(C) and P(C) go below their normal rank, does not agree in the degenerate case 
with definitions 1, 2 and 3. This is so because not VC G C will make G(C) and P(C) go below their 
normal rank, hence Z tz / C , according to this definition. In what follows we will assume that 
trzs are defined by definitions 1, 2 or 3. 


Definition 4: A number C € C is a strong transmission zero ( strz ) of (1),(2) if and only if 
G(C) = 0, or equivalently fi(C) = 0 (in the S — M form of G(C)), or equivalently 

^ =0 , with x = (C I — A)~ 1 Bu > . 

□ 


Vu / 0 , 



Theorem 2: ( 6 C cannot be assigned by output feedback if and only if C is a strong transmission 
zero. 

Proof: Let G(C) / 0, then if C G A(A) we take K = 0 and C is placed. If C £ K A ) then there 
exists u / 0 such that 


y = G(C)« # o 


=► < 


x = (C/ - A)~ x Bu 
y = Cx ^ 0 


> • 
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[Cl -(A- BKC )] X 


= (C/ - A)x + BKCx 
= Bu-B^fy = 0 

y I y 

Thus the eigenpair (£,®) has been placed, which proves the necessity of the propositionn given by 
the theorem. To prove its sufficiency assume G(() = 0, but nevertheless ( has been placed. Then 
there exists x / 0 such that for some K 


(A - C I)x - BKCx = 0 => x - (A - CI)~ l BKCx = 0 

=> Cx- C{ A - C/) -1 BKCx = 0 
=> Cx + G(QKCx = 0 
=> Cx - 0 . 

Hence from ( A — C I)x — BKCx = 0 =$■ {A — CO® = 0 =$■ ( £ A(A). This is a contradiction, since 
if C G A(A) the G(C) is not defined, however it is assumed that G(C) = 0. □ 


4 The Algorithm 


We give an algorithm based on deflation by unitary similarity transformations. 

Step 1: Compute a feasible eigenvector xi of At - B\K\C\ = A - BKC corresponding to, say 
Aj with || xj H 2 = 1- We will show in the sequel how we may compute such an Xj. 

Step 2: Allocate the eigenpair (Ax,®x) to Ai — B\K\C\ as follows: Let Vj be a unitary transfor- 
mation such that Vj^CiXj = ^xex and B\ = ( Uq,Ui ) ( ^ a QR decomposition of B\, then 


(Ai — B\K\Ci)x\ = Ai®i 


A lXl - B l (K l V 1 )(V 1 fI C 1 x 1 ) = Am 
B\K\6ie\ = (Ai — Ax/)®! , with K\ — K\V\ 
B\ki6i = (A\ — Ax/)®x , with k\ — k\e\ 

{ *' ) M , = - Ai/)xi \ 

\ 0 ) ^ U 1 H (A 1 - Ax/)®! ) 


( 5 ) 


From (5) we see that if x\ was computed in step 1 so that X[ E — Aj/)], and k\ is computed 

by solving 

* 1*1 =6; 1 U"(A l -\ l I)x l ( 6 ) 

then (5) is satisfied and the eigenpair (Ai,xi) is allocated. An eigenvector X\ that satisfies 
X\ E N[?7 1 t (Ai — Ai/)] will be called feasible. 

Step 3: Compute unitary Q\ = (x\,Q\). Hence from Q\t\ — X\ Q\ X\ = e\ we see that Q\ can 
be a Householder transformation or a sequence of rotations. 

Step 4: Perform the unitary similarity transformation 


Q?(A l -B 1 K 1 C l )Qi 


Q? 

Q? 


[Ai-B^iV^C^xuQi) 


[( A lXl - BtktSi) , (AiQi - B x K x (y»C x )Q x )) (7) 


From step 3, k\ has been computed so that A\X\ — B\kiS\ = \\Xi. If we now set K\ = 
and 


VfCiQi = 


( 7 ) = 



then 


' [Am , A 1 Q 1 -B 1 (k l e? + K i C t )) 
{<*) 


/> 

x?[AiQi - B^kxc? + K 2 C 2 )] 


Q\AiQi - Q”B x k x c” - Q?B x K 2 C 2 

f A! 

* ) 


a 2 -b 2 k 2 c 2 j 


where A 2 = Q x AiQi - Q?BxkiC? , B 2 = Q? B 2 . 


Step 5: Continue the allocations with A 2 - i? 2 ^ 2 C 2 . □ 

There are two points that need to be clarified for the allocation of Ai in the above algorithm. One 
is, how to identify whether Ai is a strz or not, and the other is the computation of x\. 
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Lemma 2: If {Ai is a strz of (1) } =$■ 6 X = 0. 
Proof: Let Ax be a strz of (1) then 


r 

< ^ 0 3x\ : 

We may note however that the above lemma on its own is not adequate to identify strz . In the 
sequel we will compute our feasible eigenvector x\ in such a way that if 6% = 0 then Ai will be a 
strz . As we will show, computing x\ in this way will also improve the numerical stability of our 
algorithm. To accomplish the above we proceed as follows: 

We know that 

I *1 1 = 11 C lXl || 2 => {6 X = 0 x x e N(Cx)} . 

Thus if we compute X\ so that 

*i€ KWi H (A l -\ l I)]-N(C 1 ) 

when 

~ W] ~ N(Cx) / {} 

then we may safely conclude that — 0 => {Ai is a strz }. If the dimension of 

N [U?{A X - A t /)] - N(Ci) 

is greater than one, we have some freedom in choosing our x\. To see how we may exploit this 
freedom, we observe from (6) that if | S\ | is unreasonably small then we may unnecessarily lose 
accuracy when we solve (6). To avoid this, let 

Cl = (J5T 0 ,ffi) ( ^ j 

be a QR decomposition of C x , then R(Ho) = R(Cf) and R(ff x ) = N(Cj). Assuming now 
|| *i h= 1 we get 

I 6 t |=|| C,*! || 2 =|| Z\H\x x || 2 <|| Z x || 2 || Hlx x || 2 . 

But || HqX x || 2 = 0 ma x(Ho x i) = cosO , <T max (-) is the maximum singular value of a matrix, and 
0 is the smallest angle between R {Cj) and R(xi) (see[l]). Hence from 


Ax — Ax / B\ 


< 5 , = 0 . 
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I tfl 1^11 * 1 1 2 cosff 

in order to make | 6i | as large as possible we need to make 0 as small as possible. Thus we will 
choose xi e N[U{*(Ai - Aj7)] that forms an angle 0 with R (H 0 ) as close to zero as possible. The 
following theorem taken from [1, p.582] and modified to meet our requirements, will give us a way 
to compute 0 as it was required above. 

Theorem 3: Let Ho and W\ form unitary bases of two subspaces. Let also 

H$W X = YY.X h 

be the singular value decomposition of HqWi, then the smallest angle between R(77o and R(kFj) 
takes place between vectors HqY e\ and W\Xe\ . 

Proof: See [1] 

Theorem 3 suggests the following algorithm for the computation of x\\ 

Compute the QR decomposition of Cf, Rj, and [U^{A\ — Ai I)] H = (Wo,VFi) 

Compute the singular value decomposition of V^W i = YYX H . 

Take x-y = W\Xe j. 

Finally to eliminate any doubt that = 0 when Aj is a strz we prove the following lemma. 
Lemma 3: N [U?(A t - Ai /)] C N(C X ) «=> {A! is a strz of (1),(2)} 

Proof: First we prove 



- Ai /)] = {(A x 7 - AiF'Buh \ u x € R m } . 


To do so, we see that, given Ui / 0 and computing x\ to satisfy 


(8) 


(Ai — X\I)xi = — B\U\ <=> 




( U 0 H 
[u? 


(Ai - Ax/)xi = 


xi 6 N[C/ 1 // (Ai - A,/)]. 


-R i 


Ui 


Thus 


{Vu, € R m , (A,/ - 6 N [U”{A X - A,/)]} => 
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{(A J - A 1 )~ 1 B 1 u l | «, G R m } C N [ff,"(A, - A,/)] . 


( 9 ) 


Let x\ : U( I (A\ — ^il)x\ = 0, then, since we can always compute 


u £ R m : R\U\ = Uq(A\ — Ai/)xi , 


we have 


vS 


u« 

from which we get 


(Ai — Ai/)a:i 


-R i 


o 


uj =» x\ = (Ai/ — i4i) l B\U\ , 


N[f/ 1 H (>1 1 - Ax/)] c {(Ax/ - I «1 e R m } . (10) 

From (9), (10) we may derive (8). 

Suppose now that 


{Ai is a strz of (1),(2)} 


{Vux G R m Cx(Ax/- Ai) _1 Bi«x = 0} 
CiN[^(Ax - Ax/)] = 0 
N[t/f (Ax - A,/)] C N(C,) . 


□ 


The algorithm that has been described so far in this section, can be used to allocate only one 
eigenvalue. We will show however that we may use it to allocate min{n,m + p— 1} eigenvalues. To 
do so we need to observe that only the first column of the current K{ is needed for the allocation 
of one eigenvalue. We also need to consider the fact that A(i4,’ — B{KiCi) = X(Aj — Cj Kj Bj). 
Given these two points then we may use the following algorithm, which for illustration we describe 
for m — 2 and p = 3, thus 


/ 

Ki = 

\ 


x 

X 


X 

X 



We allocate eigenvalues until the number of rows of the current Ki become greater than the number 
of columns. In our example this happens after the allocation of two eigenvalues, hence 


*3 = 


X 


X 
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At this point we continue working with the transposed system Aj — • Since Kj has 

two columns we are able to allocate two more eigenvalues instead of just one that we would have 
allocated if we had continued with A3. In the general case we keep transposing until we run out 
of eigenvalues or columns. Thus the total number of eigenvalues that we manage to allocate using 
this algorithm is 


m + p — 1 = 4 . 

Note that by following this algorithm we also satisfy the condition m < p at the beginning of each 
allocation. This condition has been assumed throughout this paper. Note that are associated 
with the columns of the matrix on the left of K{ or Kj and the rows of the matrix on the right of 
K{ or Kj respectively, rather than with JB, and Cj. 


5 Numerical Examples 


In this section we give two numerical examples to demonstrate the performance of our algorithm. 
The computation was performed on double precision (56- bit mantissa) using PC-MATLAB on a 
Toshiba T5100 which uses an 80387 coprocessor equipped with the IEEE floating point standard 
of arithmetic. In the computation below we show, up to 12 decimal digits of accuracy. 


Example 1: 


A = 


' 0.581314086914 
0.166717529296 
0.353500366210 
0.836242675781 
^ 0.244094848632 


0.504058837890 

0.157394409179 

0.441650390625 

0.200820922851 

0.936126708984 


0.559921264648 

0.665222167968 

0.373367309570 

0.072296142578 

0.607955932617 


0.741333007812 

0.933609008789 

0.771942138671 

0.598373413085 

0.154357910156 


0.303421020507 ^ 
0.144439697265 
0.078048706054 
0.638671875000 
0.154678344726 j 


B t 


' 0.549163818359 
0.942672729492 
l 0.613098144531 


0.843856811523 

0.008666992187 

0.065872192382 


0.178649902343 

0.239028930664 

0.300445556640 


0.409255981445 

0.142791748046 

0.576828002929 


0.595489501953 ^ 
0.671371459960 
0.726318359375 j 
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c = 


( 0.561660766601 0.332702636718 
0.427825927734 0.648269653320 
^ 0.455917358398 0.134307861328 


0.355514526367 

0.666625976562 

0.497573852539 


0.279266357421 

0.572494506835 

0.212463378906 


0.531448364257 ^ 
0.972076416015 
0.653732299804 j 


The eigenvalues to be allocated are 


{0.704589843750, 0.421768188476, 0.572113037109, 0.396102905273, 0.127380371093}. 


The K computed by our algorithm was found to be 


K = 


f -9.415631257941 
-36.155174309311 
^ 33.590525809043 


-1.520241736469 

-4.515055559417 

4.071903303950 


11.520696978251 > 
41.184483490595 
-36.923138190434 j 


The eigenvalues of A — BKC were computed and they were A(.4 — BKC) = {0.704589843749, 
0.421768188479, 0.572113037109, 0.396102905270, 0.127380371093}. 


Example 2: In the above example we had n = m + p — 1 , thus all n eigenvalues were allocated. 
However, if m + p — 1 < n then n — (m + p — 1) eigenvalues of A — BKC will take values that our 
algorithm has absolutely no control over. The following example demonstrates this point. 


A = 


( 0.356336616284 
-0.210549376245 
^ -0.355939324751 


-0.356626507847 

2.165165719183 

-0.506195941988 


-1.198870998736 ^ 
-0.882324378697 
0.721098719251 j 


B = 


( 1.606955436561 
0.062823512419 
^ -1.611627967397 


-0.407338058168 ^ 
-0.595038063525 
0.616657720777 } 



-1.182820557312 0.343687857622 


-0.357421120340 


) 


The eigenvalues to be allocated are {-0.406648486670, -0.366406484747, 0.853280016421}. 
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The K computed by our algorithm was found to be 

( -2.053475126112 \ 

K = 

^ -6.466096811958 ) 

The eigenvalues of A — BKC were computed and they were A(A — BKC ) = {-0.406648486670, 
-0.366406484747, 1.707616832510}. 

Our algorithm allocates one eigevalue at a time, thus complex eigenvalues need complex arith- 
metic. As a result, K may be complex. Investigation is under way to derive an algorithm that 
will allocate two eigenvalues at a time in a double step. In this way we will allocate a complex 
conjugate pair of eigenvalues in one double step using only real arithmetic. Hence K will be real. 


6 Conclusion 

We presented an algorithm for the pole assignment problem for multi-input, multi-output systems 
using output feedback. The algorithm uses deflation based on unitary similarity transformations 
and it allocates min{n, m + p — 1} eigenvalues. The same kind of deflation has been used in [3] to 
solve the corresponding pole assignment problem using state feedback. Since the algorithm in [3] 
has been proven to be numerically stable we hope the algorithm in this paper has the same property 
too. However, this needs to be proven by doing a rounding error analysis of the algorithm, and we 
plan to do this in the near future. 
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1 . INTRODUCTION 

The strong interaction between structural dynamics and active control is a 
well - recognized challenge in the analysis of controlled flexible structures. 
But it is only recently that the same interaction has been exploited in the 
design process. The traditional design approach in which the control design 
comes very late in the development — typically after the structure has been 
designed and built — is no longer viable. Although this approach has produced 
satisfactory results for the attitude control of relatively rigid space 
structures, it will not be suitable for the ambitious space missions that 
require precise controlled pointing of telescopes, interferometers and the 
vibration suppression for science instruments mounted on large flexible 
structures. In such systems, designing the structure and designing its control 
become entwined. This dictates early consideration of the control design — 
well before any detailed structural design is finalized. And just as the 
structure is optimally designed to meet such performance metrics as minimum 
mass or response to external disturbances, it should be optimally designed to 
meet its ultimate control performance as well. 

A natural way to introduce the control element into the overall design 
process is through an optimization procedure that combines the structural and 
control design criteria into a single problem formulation. A number of authors 
[1-6] have taken this perspective. In terms of the types of design parameters 
and constraints considered, Ref. (2) is probably the most extensive in that the 
design variables include structure parameters, actuator locations and feedback 
matrix. Static output feedback is used, and the performance objectives include 
total mass and robustness measures. Constraints are imposed on the eigenvalue 
placements, performance bounds, and structural constraints. Since not all of 
the constraints are commensurate, they are relaxed using a homotopy approach. 
Just as with Ref. (6), the approach taken in the present paper is not to 
produce the "best” optimized point design, but to produce a family of Pareto 
optimal designs representing options that assist in early trade studies. The 
philosophy is that these are candidate designs to be passed on for further 
consideration, and their function is more to guide the system design rather 
than to represent the ultimate design. 

An optimization approach that is consistent with this philosophy is to 
utilize multiple cost objectives that include an LQG cost criterion in 
conjunction with structural cost(s) , and possibly other control related costs. 
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After introducing the combined objective formulation in the setting of vector 
optimization, we derive the necessary conditions for Pareto optimality. No 

behavioral constraints are explicitly imposed in the problem formulation and a 
homotopy approach is used to generate a family of optimum designs. The intent 
of this paper is to explore this design approach and to provide an exposition 
of the computational aspects of the problem using numerical examples. 

2 . COMBINED OPTIMIZATION 

The combined optimization approach can be best appreciated when contrasted 
with the traditional sequential optimization. In the sequential optimization, 
the structure is first optimized by selecting the design variable, a (e.g. 
member sizes) which minimize a structural criterion J s (a) - often taken as the 
mass of the structure subject to some behavioral constraints h(a)>0 on 
deformations, stresses, open- loop frequencies, etc. 

min J (a) ; h(a)>0, acD (2.1) 

a 

where D is the physical domain for a. Second, having completely specified the 
optimal structural design a*, the control optimization is carried out with a* 
fixed. For example, LQG or optimal control designs pose the problem: 

min J (a*,C) (2.2) 

c c 

where J c represents either of the control criterion, and C is allowed to vary 
over the class of stabilizing compensators for the plant. 

By contrast, in the combined optimization formulation, the goal is to first 
merge the criteria of interest (here J s and J c ) into one using non-negative 
multipliers /?, and 8 , then optimize the combined criterion over the original 
design variable space a, C: 

min [f)J (a) + 6J (a,C] (2.3) 

a , C S 

The following expression compares the results of the two optimization 
procedures outlined above, 

min [/3J ( a ) + 5J (<*,C) ]<r_min f$J (a) + min SJ (a*,C)] (2.4) 

p s c s ~ c 

a , C a C 

The right-hand side of (2.4) corresponds to performing the sequential 
optimization by solving (2.1) for a*, followed by solving (2.2) for C*. Note 
that the optimal solution of the right-hand side is independent of f) and 8- 
but not so for the combined optimization embodied by the left-hand side. In 
terms of the total cost, expression (2.4) states the fact that the combined 
optimization is never inferior to the sequential optimization. In the vector 
optimization setting, the relative weighting of (3 and 8 serves as a parameter 
that allows the generation of an entire family of Pareto optima. 

In the present paper, only two objective criteria are dealt with. But it 
is not difficult to generalize the approach to incorporate other criteria such 
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as minimum open-loop frequency and certain controller robustness measures. In 
general these criteria are noncommensurate , and there is no unique solution 
that minimizes all criteria J;l, . . . simultaneously. Thus, one must look for 
Pareto optimal solutions as outlined below. 

First one assembles the criteria Jj/D+R, i~l,...,N into a single criterion 
J:D->R N , J(a) = (Ji(a) , . . . ,Jn(o) ) T . Tt\en the cone C Q = (xeR N :xj>0, i-l,...,N) is 
defined to induce a partial ordering < on R^ by x<y if y-x eC 0 . Now let aeD. 
A design vector a*€D is said to be (strongly) Pareto optimal if J (a) < J (a*) 
implies J(a)-J(a*). A necessary condition for Pareto optimality is contained 
in the following theorem due to Lin [7]. 

Theorem 2.1: If a* is Pareto optimal for the combined criterion J, and D is an 
open set, then there exists a nonzero vector ZcC D such that Z^J*(a*)-0. Here 
J* denotes the differential of J. 

For the two-term optimization problem in (2.3), we find that the Pareto 
optimal solution to J - (J s , J C ) T can be generated by solving for the necessary 
conditions for extremizing the following convex combination J^: 

J - (1- A) J (a) + AJ (a , C) ; Ac [0,1] (2.5) 
A sc 

where A replaces 6 in (2.3). The form of Eq. (2.5) suggests a homotopy (or 
continuation) approach for generating a family of Pareto optima as A is 
propagated from 0 to 1 . 

3. NECESSARY CONDITIONS 

We begin with the n s degree-of-freedom dynamical system 


M(a)r + D(a)r + K(a)r - G^u + G 2 V 


(3.1) 


where M, D, and K are the n s xn s mass, damping and stiffness, is the constant 
n s xn u control influence matrix, and G 2 is the constant n s xnd disturbance 
matrix. The vectors r, u, and v are respectively, physical degrees -of- freedom, 
control forces and disturbances. Let x - (r,r) . Then the first order state 
equation is 


x - A(a)x + B^(a)u + 1 * 2 ( 0 ) u + v' 
where 



0 

1 • 


’ 0 

r 0 1 

A - 

1 — j 

£ 

i 

* 

-M _1 D 

. »1 " 

M _1 Gi 

’ B 2 “ M" 1 G 2 J 


(3.2) 


(3.2a) 


and an additional disturbance v' independent of a has been introduced for 
greater flexibility of the formulation. We assume that (3.2) has measured 
output variables y and controlled output variables z: 


y - C^x + w, 
z - C 2 X 
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(3.3) 



and that v , v* , and w are uncorrelated white noise processes with intensities 
Q v , V , and Qw. respectively. 

In the remainder of this paper, the total mass of the structure is assumed 
to represent the structural criterion J s . Thus, for a structure consisting of 
n a one -dimensional finite elements, each having a cross-sectional area , 
length and density p : 


J 

s 



pi .a . 
l l 


(3.4) 


For the control criterion J c associated with (3.2) and (3.3), we assume the LQG 
index 


J - lim E{z^(t)Dz(t) + x^D-x + u^(t)Ru(t)J 
c 1 

t-*» 


(3.5) 


where E is the expectation, D and D]^ are non-negative definite weighting 
matrices, and R is positive definite. Although is assumed independent of a, 
D could possibly depend on a. The latter case would arise, for instance, if 
the first term in (3.5) were to represent the total energy in the system with 
z - (r,r)^ and D - diag(K,M). Under standard assumptions of stabilizability 
and detectability, the optimal compensator C* for (3.5) is implemented by [8]: 


u 

o 


T 

-B*Px 


o 


x - (A-K-C)x + K~y + B-u , 
o f o 1 o 


(3.6) 


and the optimal cost J c associated with this compensator is 


J (C) - tr{P(B 0 Q B* + V) + P,PB.R' 1 B:F P) . (3.7) 

C Z V Z til 

where P and Pf are the unique positive definite solutions to the algebraic 
Riccati equations 


T 

A P + 


PA + D - 
o 


-1 T 

PBjR B* P 


(3.8) 


AP f + PfA T + B^bJ + V - P f C^ Q w 1 C 1 P f - 0 

and D q - C* DC 2 + D.^ and K f - P f C* C^ 1 

With the above representation for J c , we seek the optimality conditions for 


min J , (a)-(l- A) J (a)-Kt/ ,a>+Atr{P(B 0 Q B^+V)+P jr PB 1 R' 1 B^ P) (3.9) 

A s Z v Z t L l 

a 
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n a 

subject to the constraints in (3.8). In (3.9) an n a - dimensional vector i/eR 
has been introduced to "regularize" the problem and to serve the purpose of 
initializing the homotopy path. 

Proposition 3.1 Let E+ denote the set of n x xn x positive definite matrices. 
The optimality conditions for a* to be a local extremum of (3.9) subject 
to (3.8) are the zeros of the function H: RxR na x E + x E+ -» R na x E+ x E+ 
defined by H - [H]_, H 2 , H 3 , H 4 , H 5 ] , where 


H 1 (A,a I Z 1 ,Z 2 ,P > P f ) 


8J 3(B Q B ) 

(1-A)~ ~ + 1 / .+\tr { P 

oa^ 1 


3 < B 1 R ' 1b I> _ 3A T 3A , 3D 0 

3a. + PP £ P + Z 1 lfc: P + ^ + 3a. 

1 i 111 


-P 


a(B 1 R' 1 B^) 

5a. 

1 




P l + VST P f + P f 77. * 3a, 

1 11 


2 a! , 3<c ? 


3a. P f)>’ 

1 


i”l > • • • I 


(3.10a) 


H 2 (A, a, Z 1 ,Z 2 , P, P f ) 


(A-B^ - 1 B^P) Z x +Z t ( A 1 - PBjR' X bJ ) +B 2 Q u B 2 +P f PBj^R' ^♦BjR* 1 B*PP f 


(3.10b) 


H 3 (A, a, Z v Z 2 , P, P f ) 


(A 1 -^ 1 C x P f ) Z 2+ Z 2 (A-P f C^Q w 1 0^+ PB^'^P 


(3.10c) 


H 4 (A, a, Z r Z 2 , P, P f ) - A T P+PA+D o -PB 1 R' 1 B 3 P, 


(3 . lOd) 


H 5 (A, a, Z r Z 2> P. p f ) - AP f+ P f A T + B 2 Q u B^V-P f c{q w 1 


(3 . lOe) 


The proof of proposition (3.1) is omitted here to conserve space, but is given 
in detail in Ref. [9]. Thus, for the LQG formulation, the necessary conditions 
involve solving two algebraic Riccati equations (3.10d, e) , two Lyapunov 
equations (3.10b, c) , and a gradient equation (3.10a) for a^, i — 1 n a . In 
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the case of the LQR formulation, it is easily verified that the optimization 
statement expressed by (3.9) reduces to 


min 

a 


J (a)-(l-A)J (a)4<j/,a>+Atr{P(B „Q B*+V) } 

A S Z V Z 


(3.11) 


and that the optimality conditions reduce to finding the zeros of the simpler 
function H 2 , H 3 ] involving one Riccati and one Lyapunov equation 
(instead of two as in LX)G) , in addition to the gradient equations: 


5J s 

H(A,a,Z,P) - (1- A) ^ + v i+ Atr{ P 1 


.. 3D 3(B 1 R' 1 bT) 

«[2P^^.P 1,. y P1); 


da^ 


da. 


i- 1 , 


,n 


(3.12a) 


H_(A,a,Z,P) - A Z + ZA + B„Q B^ + V, 
^ c c 2 v 2 


(3.12b) 


H (A,a,Z,P) - A T P + PA + D - PB^R'^P, 
3 oil’ 


(3.12c) 


with A c - A - B^R'^-B^P. 


4 . HOMOTOPY STRATEGY 

For all At [ 0 , 1 ] , our goal is to minimize (3.9) in the case of the LQG 
formulation or (3.11) in the case of the LQR formulation by finding the design 
variables a that satisfy the corresponding optimality conditions (3.10) or 
(3.12). The basic strategy is: given the solution at a value A Q , smoothly 
propagate it to a new solution at A 0 +AA via some local mechanism such as Newton 
method or iterative optimization. This strategy has been analyzed in detail in 
Ref. (9). In the following, only a summary of the results is given without 
proofs, assuming the LQR formulation. 

Let x denote a generic point (a, Z, P)e R na x £ + x £+ so that H(A , x) stands 
for H(A , a, Z, P) . In determining the zeros of H, the following proposition 
asserts that in a small neighborhood about the optimal at A— 0, there is a 
smooth path parameterized by A consisting of the global optimal solution. 

Proposition 4.1: Suppose that min J s (a) has a unique global solution a* 
satisfying the second order sufficiency condition on the positivity of the 
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Hessian [J s (a*)], a a >0. Further, assume that: J s is coercive so that 

i j 

|J s (a)|-**> as |a |-**>. Then there exists c>0 such that (3.11) has a unique global 
solution for \<e . 


The next proposition provides a sufficient condition for the path to remain 
locally optimal. 

Proposition 4.2: Let $(A) — (A, x(A)) denote a smooth path in 

[ 0 , r ) xR na x £+ x £+ with H(^(A)) - 0 and H, x invertible for Ae[0,r). Such an r 
is guaranteed by Proposition 4.1. Then a(A) is a local minimum for J A for each 
Ac [0,r) . 

The purpose of the following lemma is to demonstrate that the zero set of H 
is "generically " well-behaved. 

Lemma 4.3: Suppose that H(0,x)~0 has a unique solution. Then for almost every 

choice of (i/, V, D 1 )fR na x x )] + , the solution path emanating from (0. x*) is 
diffeomorphic to the real R and every other component of H‘ x (o) is 
dif feomorphic to either a circle or R. 


Thus, following the path defined by one of the zero curves of H, not just 
the one emanating from the optimal at A— 0, will not lead to a pathological 
behavior such as bifurcations or curves with infinite length in bounded sets. 
Another fundamental and generally difficult question that arises when employing 
homotopy methods is whether or not the path remains bounded. The following 
result provides a partial answer to this question. 


Proposition 4.4: Suppose that J s , B^, 0 o , 

a l»--- a na» anc * assume coercivity of J s (o)/|ar| 


and A are all polynomials in 
If H(0,x)-0 has a unique 
solution, then for any c>0 and for almost every triple (i/, V, D^) 
cR na x x £+, the component of H'^O) containing (0,x*) is a bounded set In 
[ 0 , 1 - € ] xR na X X+ X l+. 


5. NUMERICAL RESULTS 


The numerical experiments described in this section demonstrate the results 
of the foregoing theory. Two prototype examples are used; both employing the 
LQR formulation. Implementation of the homotopy strategy of the previous 
section is achieved by iterative optimization, wherein the solution path for 


minimizing (3.11) in terms of the homotopy parameter A starts at A-0 with 
<*!>••• ,c* na initialized to a predetermined sufficiently small allowable size a Q . 
At this point in the solution space, is fully weighted toward minimizing J s 
only. Minimization of J^ Q thus yields a Q * for which H(A Qf x o *)-0. For the 
next iteration and for every succeeding one, A is incremented and the weighting 
is shifted gradually toward J c . For a typical iteration j, the following steps 
are performed: 


(i) Aj <= Aj_i + AA j _ \ 

(ii) Initialize the minimization of J ^ j by using a y t=a ' k ^-l* This will 
result in aj* for which conditions (3.12) hold. 
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In performing the minimization in (ii) above, we employed the Automated Design 
Synthesis (ADS) system of general purpose subroutines [10] . ADS provides a 
wide selection of options at three levels: strategy, optimization, and line 

search. Available strategies include sequential linear and quadratic 
programming, and sequential unconstrained minimization coupled with various 
penalty methods. At the optimization level, one can choose between Fletcher- 
Reeves algorithm and the variable metric method of Broydon-Fletcher-Goldfarb- 
Shanno (BFGS) for unconstrained minimization, or Zoutendijk's method of 
feasible directions and modifications thereof for constrained minimization. 
For one -dimensional line search, the options include a combination of 
polynomial interpolation/extrapolation, solution bounding, and the method of 
Golden Section. Not all combination of options are compatible at the three 
levels, and the program parameters must be adjusted to suit the problem at 
hand. For this purpose, an analytical function was contrived which possessed 
such features as: easy to compute closed form solution, multiple minima, and 

insensitivity of the functions gradient near the minima to design parameters. 
Several compatible options were tried and the program parameter values (e.g., 
move limits and convergence criteria on the absolute and relative changes in 
objective function between two successive iterations) were adjusted until the 
closed form and numerical solution agreed within as few iterations as possible. 
As a result of these numerical experiments, the popular BFGS variable metric 
method for unconstrained minimization emerged as the one of choice for use in 
the combined control -structure optimization examples that follow. During the 
one dimensional line search, the minimum is located by first computing the 
bounds, then using polynomial interpolation. 

Example 1: The cantilever beam of Fig. 1 resembling a flexible appendage' of a 

large structure is modelled by three finite elements with two degrees -of - 
freedom (dof) at each node. The structural design variables are the x- 
sectional areas a\ 9 <* 2 , aj of the elements. The disturbance u represents a 
transverse sound pressure modelled by uncorrelated unit impulses at t-0 
concentrated at the three nodal transverse dof. The control force u is applied 
at the free end along the transverse dof direction. With the J s given by 
(3.4), we seek the minimum of (3.11) for Ae[0_,l], subject to conditions (3.12). 
Here, the weighting matrices were taken as D-IO^ x diag(K,M), and R-10~\ and 
the initial used were a£-a o -lxl0“^ , i-1,.,.,3. 

Table 1 lists the family of Pareto optimal designs that minimize 

J^\ , Ac [ 0 , . 99 ] along with the corresponding values for J c , J s and J^. The 
variations of the Pareto optimal J C *(A) , J S *(A) and J^*(A) are shown in Fig. 2. 
A number of observations can be made from Table 1 and Fig. 2: 

(i) The noncommensurate nature of the two costs J c and J s is apparent as the 

weight is shifted between them: while J c is a strictly decreasing 

function of A, J s is a strictly increasing function of A. This is 
consistent since a stiffer structure requires less control energy. 

(ii) Except near A-0 , the optimal structural shapes that minimize J \ for the 
disturbance, choice of D and R, and control forces described above are 
essentially 03-02 nea r the fixed end, and a much smaller 03 near the 
free end. This is a physically plausible optimum shape that minimizes 
the mixture of high strain energy density near the fixed end and high 
kinetic energy density near the free end. Other choices of disturbance, 
control location and D, R are expected to alter the optimal beam shape. 


162 


(iii) Although the design at A-0 is guaranteed to be globally optimal 
(Proposition 4.1), it is possible that designs generated as A is 
continued may be only locally optimal. To assess this possibility, the 
optimal designs listed for A”. 200, A“.700 and A - . 900 were re-examined 

separately. For each case, the minimization was restarted with randomly 
selected initial Qi values. In most cases, the minimization converged 
to the same or to a higher minimum than obtained in Table. 1. 



Example 2: The beam in this example (See Fig. 3) simulates a flexible 
appendage (length - 45 m) attached to a rigid hub (inertia - 50 kg.m / ) to which 
a control torque is applied to counteract the transverse unit impulse at the 
fr 0 e end. The beam is modelled by three finite elements of constant width * 

, 001 m, but whose nodal depths d^ d 4 represent design variables having a 

lower bound - .001 m. Here again, J s represents the total mass of the flexible 
beam (excluding the hub). For the control objective J c , the response energy is 
weighed by D]^ so as to minimize the transverse free end displacement, and R is 
taken - l.xlO'^. 

Table 2 and Fig. 4 represent analogous results to those presented for 
Example 1 in Table 1 and Fig. 2. In addition to observation (i) of the 
previous example - which holds here as well - the following remarks can be made 
with reference to the results of this example: 


163 



(i) For small values of A (e.g. A-0.1), where the total mass J s dominates 

the minimization of J^, the optimal shapes tend to have a small slope 
from d^-+d 2 ~* f d 3 , followed by a sharper slope from d 3 ^d 4 « As A increases, 
minimizing J s becomes less important than minimizing J c (tip 
displacement response energy plus control energy). As a result, the 
beam becomes gradually stiffer, and the monotonic slope from 
associated with small A values gradually disappears at A=*.45, then gives 
way to a pronounced inflection of slope for d 3 ~>d 4 for A>.45. This 
results in a larger allocation of mass at the tip. This type of shape 
is physically consistent with the requirements of the two parts of the 
control objective J c : a stiffer structure near the hub that is reduced 

toward the tip (free end) makes best distribution of mass, while 
minimizing the tip displacement response. On the other hand, a large 
mass at the tip (where the disturbance exists) makes the disturbance 
less effective - thus requiring less control effort. 


(ii) To confirm the above interpretation, the case of A=*0 . 7 in Table 2 (i.e. 

r« 10 - ^) was re-examined for smaller and larger values of R; R~10“° and 
R~10"2, respectively. As Fig. 5 shows, smaller values of R give more 
weighting to the tip displacement response energy part of J c , thus 
giving rise to the optimum shape being a stiffer structure near the hub 
which is reduced toward the tip. Conversely, larger values of R (e.g. 
R* 10 ~ 2 ) give more relative weighting to the control energy part of J c , 
which is best minimized by the presence of the heavier tip mass. It is 
interesting to note the similar effect of varying R and varying A on the 
optimal shapes . 


Table 2. Pareto Optimal Designs for Example 2. 


A 

d l 

Optimum Design 
d2 d 3 

d 4 

d c 

Js 

J A 

.000 

.001 

.001 

.001 

.001 

1 . 6xl0 +10 

.075 

.075 

.0001 

.02404 

.02363 

.02291 

.01366 

3355.26 

1.628 

1.963 

.001 

.03552 

.03485 

.03359 

.02065 

482.6 

2.404 

2.884 

.010 

.05223 

.05134 

.04940 

.03043 

70.78 

3.54 

4.21 

.100 

.07699 

.07559 

.07157 

.05303 

10.03 

5.28 

5.757 

.200 

.08759 

.08563 

.08031 

.06659 

5.33 

6.05 

5.906 

.300 

.09550 

.09267 

.08657 

.07847 

3.54 

6.62 

5.703 

.400 

.10258 

.09867 

.09187 

.09085 

2.56 

7.15 

5.31 

.500 

.10944 

.10389 

.09670 

. 10470 

1.93 

7.66 

4.79 

.600 

.11715 

.10948 

.10135 

.12176 

1.46 

8.22 

4.17 

.700 

.12657 

.11576 

.10681 

.14516 

1.08 

8.92 

3.43 

.800 

.13944 

.12322 

.11304 

.18055 

.77 

9.87 

2.59 

.900 

.16369 

.13641 

.12430 

.24923 

.47 

11.63 

1.59 

.920 

.17299 

. 14104 

.12893 

.27374 

.41 

12.28 

1.355 

.940 

.18542 

.14815 

.13565 

.30544 

.34 

13.18 

1.107 

.960 

.20578 

.16117 

.14818 

.35131 

.26 

14.64 

0.83 

.980 

.25035 

.19468 

.18421 

.42401 

.16 

17.83 

0.52 

.990 

.29401 

.25488 

.24878 

.48165 

.09 

22.20 

0.32 


164 




(iii) Of general interest to problems in combined optimization - at least 
numerically - is the question as to the degree of "roughness" of the 
hyper-surface J A (a) . A partial answer to this question is provided in 
Fig. 6 after introducing idealizations that reduce the number of 
variables from four (di,...,d 4 ) to only two (d 3 , d 4 > , so that a three 
dimensional plot could be generated. Figure 6 shows such a surface in 
the neighborhood of the optimum for the case A _ 0.7 in Table 2. This is 
achieved by fixing d^**.13, allowing dj and d^ to assume various values 
larger and smaller than those in Table 2 for A— 0 . 7 , and letting 
d 2 -h(d]_+d 3 ) • Assuming that the idealizations above (which led to 
reducing the dimensionality of the surface) did not alter the basic 

topology of J A surface, it appears from Fig. 6 that J A is a smooth 
function of the design variables - at least in the neighborhood of the 
minimum. Furthermore, with these idealizations it appears that is 
relatively flat near the minimum along the d 4 axis, and that the optimum 
shape is some linear combination of the four basic shapes depicted at 
the corners . 

6 . CONCLUSIONS 

An approach for combined control -structure optimization keyed to enhancing 
early design trade-offs has been outlined and illustrated by numerical 
examples. The approach employs a homotopic strategy and appears to be 

effective for generating families of designs that can be used in these early 
trade studies. 

Analytical results were obtained for classes of structure/control 
objectives with LQG and LQR costs. For these, we have demonstrated that global 
optima can be computed for small values of the homotopy parameter. Conditions 
for local optima along the homotopy path were also given. Details of two 

numerical examples employing the LQR control cost were given showing variations 
of the optimal design variables along the homotopy path. The results of the 
second example suggest that introducing a second homotopy parameter relating 
the two parts of the control index in the LQG/LQR formulation might serve to 
enlarge the family of Pareto optima, but its effect on modifying the optimal 
structural shapes may be analogous to the original parameter A . 
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Figure 1 


EXAMPLE 1. 

CANTILEVER BEAM PROBLEM 



STRUCTURAL MODEL: MASS DENSITY > 1660 Kfl/m 3 , MODULUS * 9.5« x 10’° N/m* 
MODAL DAMPING > 0.5% 

CONTROL: DISTURBANCE * TRANSVERSE PRESSURE IMPULSE CONCENTRATED 

AT THE NODES 

RESPONSE ENERGY WEIGHTED BY D * Olag (K.M) x 10 2 , 
CONTROL ENERGY WEIGHTED BY R « 1 x 10* 

DESIGN VARIABLES: a,, a 2 , a, * 1 X 10' 7 

Figure 2 

CANTILEVER BEAM OPTIMIZATION 
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Figure 3 


EXAMPLE 2. 
HUB-BEAM PROBLEM 



STRUCTURAL MODEL: MASS DENSITY * 1660 Kfl/m 3 , MODULUS a 0.56 x 10 10 N/m 2 
MODAL DAMPING a 0.9%, 

CONTROL: DISTURBANCE a UNIT IMPULSE 

RESPONSE ENERGY WEIGHTED TO MINIMIZE END DISPLACEMENT, R a 1 x 10 4 

DESIGN VARIABLES: d v - • •, d 4 £0.001 


Figure 4 

HUB-BEAM OPTIMIZATION 
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PARALLEL PROCESSING OP REAL-TIME DYNAMIC SYSTEMS SIMULATION ON 
OSCAR (Optimally scheduled Advanced mult i processoR ) 


Hironori Ka Sahara* , Hiroki Honda and Seinosuke Narita 
Dept, of Electrical Engineering, Waseda University 
3-4-1 Ohkubo Shinjuku-ku, Tokyo, 169, Japan 


Abstract 

This paper presents parallel processing of rea 1 t iae dynamic systems 
simulation on a multiprocessor system named OSCAR. In the simulation of 
dynamic systems, generally, the same calculation are repeated every time 
step. However, we cannot apply the Do-all or the Do- across techniques for 
parallel processing of the simulation since there exist data dependencies 
from the end of an iteration to the beginning of the next iteration and 
furthermore data- input and data-output are required every sampling time 
period. Therefore, parallelism inside the calculation required for a 
single time step, or a large basic block which consists of arithmetic 
assignment statements, must be used. In the proposed method, near fine 
grain tasks, each of which consists of one or more floating point 
operations, are generated to extract the parallelism from the calculation 
and assigned to processors by using optimal static scheduling at compile 
time in order to reduce large run time overhead caused by the use of near 
fine grain tasks. The practicality of the scheme is demonstrated on OSCAR 
(Optimally Scheduled Advanced multiprocessoR) which has been developed to 
extract advantageous features of static scheduling algorithms to the 
maximum extent. 


I. INTRODUCTION 

High speed dynamic systems simulation, or solution of ordinary 
differential equations, has been required to simulate dynamic behaviors of 
various systems such as airplanes, missiles, nuclear reactors and robots, 
in real-time. So far, the dynamic systems simulation has generally been 
performed on traditional analog or hybrid computers or on general-purpose 
digital computers by using a simulation language like CSMP (Continuous 
Systems Modeling Program). However, these approaches have several 
problems, for example, operational accuracy and realization of non-linear 
functions for the analog computers and high processing cost and real-time 
input-output for the general purpose main-frame computers. 

In an attempt to resolve these problems, the use of parallel 
processing techniques [ 13- 14 ] has attracted much attention. In fact, 
various parallel processing schemes, especially parallel processing using 
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multiprocessor syst ens [ 1 4- 15 ] , have been so far proposed[l-4] . The 
differences among the schemes lie in the choice of task granularity and 
task scheduling. For example, Korn [1] and Koyama [4J employed a large 
task size (coarse task grain) approach where the computation for the 
numerical integration of each equation in a set of first-order 
simultaneous differential equations was selected as a task. The generated 
tasks are assigned properly by the user to a relatively small number of 
processors. The functional distribution approach by Gilbert, et al[2], 
dealt with each fundamental operation (four fundamental arithmetic 
operations, integration and so on) as a task to be assigned to a dedicated 
hardware operational unit. Yoshikawa, et al[3], also adopted an approach 
where each fundamental arithmetic operation was assigned to one processor. 
The common problem left unsolved to these approaches was poor parallel 
processing efficiency stemming from the lack of efficient methods which 
allocate the generated tasks onto an arbitrary number of parallel 
processors in an optimal manner. This paper proposes a parallel 
processing scheme for the solution of the above-mentioned problem by using 
static minimum execution time multiprocessor scheduling algorithms[5] [ 10] 
already developed by the authors for optimum task allocation. The 
proposed parallelizing compilation scheme consists of the following 
processes: task generation, optimal task scheduling, and generation of 
machine codes to be executed on respective processor element. 

The effectiveness and practicality of the proposed scheme are 
demonstrated on OSCAR's processor cluster with sixteen 32-bit RISC-like 
processor elements which has been designed to extract advantageous 
features of static scheduling at compile time to the extent. 


II. A PARALLEL PROCESSING SCHEME USING STATIC SCHEDULING 

Generally, dynamics of most continuous- time systems can be modeled by 
the following explicit first-order simultaneous ordinary differential 
equations: 


dx|/dt=f j(t ,xj ,x 2 , . . . .Xjb) (i=l ,2 m) 

Therefore, the dynamics systems simulation can be regarded as the 
solution of the ordinary differential equations. Hence, this paper 
handles parallel solution of the equations using various numerical 
integration formulae such as Euler. Trapezoidal, 3rd- and 4th-order Adams 
Bashforth, 4th-order Runge Kutta and 4th-order Adams Moulton (predictor- 
corrector method) listed in Table 1. In applying these integration 
formulae, the computation required for each integration step consists of 
arithmetic assignment statements to evaluate the derivative of each 
equation and to perform numerical integration. Between consecutive 
iterations, there exist data dependencies[ 16- 17] from the end of an 
iteration to the beginning of the next iteration. Furthermore, for real- 
time simulation, data input and data output are required every iteration 
or few iterations, namely every sampling period. Therefore, we cannot 
apply Do-all and Doacross techniques to parallel processing of the 
dynamic systems simulation which are popular parallel processing schemes 
for a Do loop on a multiprocessor system! 18] [19]. 

Taking into consideration these facts, in order to realize efficient 
parallel processing of the simulation, we must parallel process a block of 
arithmetic assignment statements, or a basic block, in each iteration. 
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However, the parallel processing of the basic block on a multiprocessor 
system has been thought to be very difficult since data transfer overhead 
and synchronization overhead are relatively large. The proposed scheme 
allows us to minimize these overheads and to realize efficient processing 
by generating optimized machine codes based on the static schedule at 
compilation time. 

A. Task Generation 

As mentioned before, in the dynamic systems simulation, we must 
process each iteration in parallel though we can sometimes unroll a few 
iterations if data input and output should be made every few iterations. 
In order to process the iteration in parallel, first of all, we must 
generate tasks with suitable granularity, which are basic units assigned 
to processors. As for the task granularity, several levels may be 
perceived: equation level, operation element level, and intermediate 
level. In the case of equation level granularity, the computation related 
to each subscript i for each numerical integration formula listed in TABLE 
1 (the computation of a derivative and that of numerical integration 
corresponding to each formula for each variable Xj) is considered to be a 
task. When operation element level granularity is adopted, the computation 
for each derivative or for each numerical integration is subdivided into 
finer fundamental operation elements such as the four arithmetic 
operations and trigonometric functions, each of which is taken as a task 
and allocated to the processors (fine granularity). In the intermediate 
task granularity, several floating point operations are combined to form a 
task. For instance, when Van der Pol's equations 

dxi/dt = x 2 2 
dx 2 /dt = 6X 2 " X 1 #EX 2 _X 1 

is decomposed into fairly small intermediate- level tasks, three 
multiplication tasks, two subtraction tasks, and two integration tasks 
(including several floating point operations) are generated. Fig. 1 
depicts the block diagram representation of the seven tasks, with data 
dependencies explicitly shown. 

There exists no general rule for determining the best task 
granularity applicable to all kinds of dynamic systems. When parallel 
processing is performed on a multiprocessor system with little data 
transfer and synchronization overheads among processor elements, the 
operation element level granularity is known to be most advantageous to 
achieve minimum processing time because parallelism can be exploited to 
the maximum extent. For a large-scale problem (the order of simultaneous 
equations is very high) or a multiprocessor system with poor data transfer 
capabilities, however, the operation element level granularity does not 
always give the best performance. In other words, much attention must be 
paid to such factors as processor speed, interprocessor data transfer 
speed, size and parallelism inherent to the problem in hand, and 
complexities of scheduling mechanisms (both software and hardware) [7). 
Namely, we must choose the best granularity for each problem and each 
multiprocessor system. For this reason, the proposed parallel processing 
scheme provides with two methods for the input of simulation source 
programs. The first method employs a simplified simulation language shown 
in Fig. 2, which allows direct input of mathematical equations. The user 
can specify arbitrary task granularity from the operation element level to 
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the equation level. The second Method facilitates the input of block 
diagram representations such as those employed for analog computer. As 
shown in Fig.l, each operational element of analog computer (adder, 
integrator, etc.) can be taken as a task, to realize near fine 
granularity. Medium granularity can also be dealt with by combining 
automatically several tasks with near fine granularity. (This process is 
referred to as task fusion). In what follows, emphasis will be placed on 
the case of near fine granularity, namely the finest granularity that can 
be treated by use of the proposed scheme on a multiprocessor system named 
OSCAR mentioned later. 

Next, the proposed parallel processing scheme analyzes precedence 
relations caused by data dependencies among the generated tasks and 
represents the task precedence relations by a task graph like Fig. 3 which 
is a directed acyclic graph (DAG). The precedence constraints represent 
the restrictions existing among tasks regarding the execution order of 
tasks. The existence of task i precedent to task j means that the 
execution of task j cannot be initiated before the completion of task i. 
The precedence relation can be examined by the data flow analysis among 
tasks. When the data flow analysis is made, the output variable of each 
integration task is treated as an initial value. Each node in the task 
graph stands for a task and an arc between a pair of nodes for the 
precedence constraint. Nodes 0 and 8 are not actual nodes but dummy nodes 
introduced for the sake of convenience. They represent the entry node and 
the exit node, respectively. The figure beside each node represents the 
estimated processing time of the corresponding task. Since the actual 
processing time does not usually take on a fixed value but varies with the 
data to be processed, the average value or the worst-case value is 
employed as the input[7], which is used in the scheduling algorithms to 
be described in the subsequent section. When the average value is used for 
each task, the resultant schedule gives the minimum value of the average 
processing time of the task set. Similarly, when the worst-case value is 
used, the worst -case processing time is minimized. However, OSCAR, which 
is a target machine in this paper, can execute all instructions including 
a few floating point operations in one clock by employing RISC like 
processor. Therefore, we don't have the above mentioned problem on OSCAR, 
a compiler can estimate accurate processing time of each task. 

Once a task graph is generated, the minimum possible processing time 
achieved by parallel processing of the tasks can be estimated as the 
critical path length t cr of the task graph. In Fig. 3, the critical path is 
shown by double- line segments. 

An unique task graph can also be generated by following simple 
procedures in the case of the block diagram input mode. The task graph 
shown in Fig. 3 represents the computation in one integration step when 
the tasks are generated in the size of near fine granularity and the 
numerical integration method employed is Euler, Trapezoidal or 3rd- or 
4thorder Adams Bashforth. The integration task involves computation 
specific to each numerical integration method. When the 4th-order Runge- 
Kutta method is employed, k^ through k 4 need to be evaluated, and the 
computation described by this task graph is repeated four times or the 
expanded task graph involving the computation repeated four times is 
processed for each integration step. In the former case, the content of 
each integration task to be processed differs with the iteration count in 
order to evaluate k^ through k 4 and their weighted average. Similarly, 
when a predictor- corrector method such as the 4th-order Adams-Moulton is 
used, the task graph is computed twice or the expanded task graph to 
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represent the unrolled computation is processed for each integration step. 
In the former case, the computation corresponding to the predictor of the 
integration task is performed first, followed by the computation for the 
corrector. 

As mentioned earlier, the task graph shown in Fig. 3 represents the 
case where the tasks are generated in the size of near fine granularity. 
When coarse granularity at the equation level is employed for task 
generation, the portion surrounded by the dashed lines becomes a task. 
Also, in the case of fine granularity, the portion of each integration 
task is replaced by a subgraph generated by subdividing it into the 
operation element level. 

It should be mentioned here that parallel processing scheme proposed 
in this paper is so designed that the tasks generated in either fine or 
near fine granularity level can be fused automatically without sacrificing 
much parallelism. As a simple example, when there exist a pair of 
successor task (son node) with only in-edge and the predecessor task 
(father node) with only one out-edge, the two tasks are fused into a 
single task. Even such an easy task fusion technique allows the 
optimization of resister utilization and avoids unnecessary data transfer 
for more efficient parallel processing. 

B. Scheduling Algorithms 

In order to process the set of tasks on a multiprocessor system 
efficiently, the assignment of tasks onto the parallel processors and the 
execution order among the tasks assigned to the same processor must be 
determined optimally. The problem which determines the optimal assignment 
and execution order can be treated as the traditional multiprocessor 
scheduling problem of which the objective function is the minimization of 
the parallel processing time or schedule length [5][8]. To state formally, 
the scheduling problem is to determine such a nonpreempt ive schedule that 
the execution time or the scheduling length be minimum, given a set of n 

computational tasks T (T1 Tn) , precedence constraints among the 

tasks and n processors with the same processing capability. This problem, 
however, has been known as a "strong" NP hard problem [9]. In other 
words, unless P=NP, it is impossible to construct not only a pseudo- 
polynomial time optimization algorithm but also a fully polynomial time 
approximation scheme. With this fact in mind, the authors have 
successfully constructed a heuristic algorithm named CP/MISF and an 
efficient practical algorithm called DF/IHS [5J. The former algorithm can 
provide very precise approximate solutions quite rapidly because of its 
very low time complexity. The latter algorithm can obtain optimal 
solutions or approximate solutions with guaranteed accuracies from optimal 
solutions by combining CP/MISF and depth-first search. In what follows, 
the two algorithms are explained very briefly. For further details, the 
reader is referred to the literature [5]. 

1) CP/MISF(Critical Path/Most Immediate Successors First) Method 
This method essentially is a kind of list scheduling algorithms. 

step.l Determine the level lj for each task. The 1 j is the longest path 
from Hj to the exit node. 

step. 2 Construct the priority list in the descending order of lj and 
the number of immediately successive tasks, 
step. 3 Execute list scheduling [8] on the basis of the priority list. 
Since the list scheduling may be regarded as a method to construct the 
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schedule for the case where a set of tasks are processed in parallel in 
the data-driven Banner considering the priority assigned to each task, it 
can be easily extended to dynaaic scheduling at run tine. 

Furthermore, the list scheduling can be also aodified to eliminate 
unnecessary data transfer among processors. In the modified algorithm 
CP/DT/MISF method[10], when the tasks with the same priority are allocated 
to a processor, a task is allocated to the processor which needs the 
minimum data transfer to execute the task. This simple modification 
significantly decreases the data transfer overhead for the multiprocessor 
system with poor data transfer performance. 

Its average performance was evaluated for a total of over nine 
thousand test cases by comparing the CP/MISF solutions with the lower 
bound function[ll] . Optimal solutions were obtained for 67 percent of the 
cases tested. Approximate solutions with errors of less than 5 percent 
were obtained for 87 percent of the cases and those with errors of 10 
percent for 98.5 percent of the cases. The worst -case performance of 
CP/MISF, i.e., the error of the worst -case solution t obtained by 
CP/MISF from the true optimal solution t opt is given by 

(t-topt^opti 1 /® [51 ■ 

2 

In addition, the time complexity of CP/MISF is 0(n +mn). For problems with 
about one thousand tasks, it only takes a few ten seconds on a HITAC M280H 
system. In sumuary, CP/MISF is suitable for the solution of very large 
problems with hundreds or even thousands of tasks. 

2) DF/IHS (Depth First/Implicit Heuristic Search) Method 

DF/IHS is an optimization/approximation algorithm to determine schedules 
(solutions) which are always more precise than those by CP/MISF. The 
method combines CP/MISF and depth-first search in a special manner and 
reduces markedly space complexity (memory requirements) and average 
computation (search) time. It is so practical and powerful that optimal 
schedules for most large-scale problems involving a few hundred tasks for 
a total of some ten parallel processors can be determined in several 
seconds to one hundred seconds on an M280H. Optimal solutions could be 
obtained for 75* of the test problems where the upper limit of search time 
was set to 180 seconds [5]. The effectiveness of DF/IHS may be recognized 
by considering the fact that use of dynamic programming could provide 
optimal solutions for small problems with less than 40 tasks even for two 
parallel processors. In the case of parallel processing on a limited 
number of processors, it is known that there exist such task graphs that 
the minimum processing time cannot be attained by data driven execution or 
the list scheduling [12]. For these task graphs, use of DF/IHS can 
determine the optimal schedule that gives rise to the minimum processing 
time by forcing some processors to be idle for a certain time period. This 
fact implies the possibility of more efficient parallel processing than 
data flow machines. In sunmary DF/IHS is very useful when CP/MISF fails 
to obtain an accurate solution for problems with several hundred tasks. 

C. Machine Code Generation 

For the efficient execution on an actual multiprocessor system, the 
optimal machine codes tailored to the given system must be generated by 
using the scheduled results. The scheduled results give us the information 
about tasks to be executed on each processor element, the execution order 
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of tasks on the same processor element, the rough estimates of waiting 
time of the tasks which wait for the data from other tasks assigned to 
other processors, the tasks to be synchronized and so on. Therefore, we 
can generate the machine codes for each processor by putting together the 
codes for the tasks assigned to the processor and attaching the codes for 
synchronization and data transfer among processors. The "version number" 
method is used for the synchronization among tasks. The version number 
corresponds to the number of times of iterations or integration steps. 
Each "writer" task updates the version number on the common memory to the 
number of current integration step for itself after it finishes writing 
the shared data. And each "reader" task checks the version number if the 
number is the same as the number of current integration step to the reader 
task. AH processor elements (PE's) have the same version numbers during 
one integration step and update or increase the number at the end of the 
integration step. Updating the version number on each PE by respective 
PE's allows us to eliminate the need to update the version number (or to 
reset a flag used in test & set or semaphore) attached to each shared data 
on a common memory when the next integration step is started. Therefore, 
the version number method can minimize the frequency of access to the 
common memory for task synchronization in this application. 

We can also optimize the codes to minimize various processing 
overheads by making full use of all information which is obtained as the 
result of static scheduling. For example, the information about task 
assignment and execution order allows the optimized use of the registers 
of the processor when the tasks allocated to the same processor exchange 
data. The optimal use of registers reduces the processing time markedly. 
The knowledge about the estimated waiting time helps prevent the 
degradation of data transfer performance caused by frequent bus access to 
check the existence of the required data (data level synchronization) by 
the waiting task. In other words, if it is estimated that the task must 
wait the data for a long time, the frequency to check a flag on a common 
memory is reduced. In addition, we can minimize the synchronization 
overhead by carefully taking into consideration the information about the 
tasks to be synchronized, the task assignment and the execution order. For 
example, let tasks A, B and C be allocated to processor 1 and tasks D and 
E to processors 2 and 3 respectively as shown in Fig. 4 and data among the 
tasks be transferred via a common memory. Then task B does not need to 
check the flag which shows the completion of task A because both tasks are 
allocated to the same processor. Task E has no need to check the flag 
which indicates the completion of task D because the termination of task D 
has already been confirmed by task C or B. 

In the parallel processing scheme, the transfer of output data of 
integration tasks is not represented on a task graph since data flow 
analysis is performed on the assumption that output data of the 
integration tasks has been given as initial values. In actual processing, 
however, those data must be transferred to several tasks allocated on 
other PE's between the end of an integration step and the beginning of the 
next integration step since, daring one integration step, all the tasks 
except the integration tasks use the output data of the integration tasks 
generated in the previous integration step. The data transfer at one time 
causes bus congestion. In order to prevent the bos congestion, two copies 
of machine codes for each PE which are assigned different data storages 
are generated and executed alternatively for every integration step. 
Generating the two copies of codes allows each integration task in a copy 
of codes to write or transfer its output data, as soon as it completes 
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execution, onto a data storage assigned for the next integration step or 
another copy of codes. In other words, it allows distributed bus access 
and also to eliminate data synchronization to check the completion of the 
integration tasks because the output data of the integration tasks has 
already been transferred before the end of each integration step. 

The optimal machine codes for each PE generated in the way mentioned 
above are loaded to the local instruction memory of each processor element 
and executed asynchronously. The four steps of the proposed parallel 
processing scheme described in this section can be performed automatically 
by a special purpose compiler. 


III. PERFORMANCE EVALUATION ON OSCAR 

This section discusses the performance evaluation of the proposed 
parallel processing scheme on a prototype multiprocessor supercomputing 
system named OSCAR being developed by the authors. 

In the following, as an example of parallel processing of the 
practical dynamic systems simulation for evaluating the performance of the 
proposed scheme on OSCAR, dynamics simulation of a hot strip mill control 
in a steel making plant is treated. The simulation program can be 
represented by a block diagram shown in Fig. 5. In this example, near 
fine task granularity has been chosen in which each integration task 
consists of several floating point operations and the other tasks consist 
of only one floating point operation. By the task generation method using 
near fine granularity, fifty-one tasks involving nine integration tasks 
were generated. Fig. 6 is a task graph generated from Fig. 5 automatically 
by a special purpose compiler. 

OSCAR is a hierarchical multiprocessor system which has a plurality 
of processor clusters as shown in Fig. 7. Its goal is to realize, by the 
combined use of static scheduling and dynamic scheduling, efficient 
parallel processing of Fortran programs and a variety of applications 
including those which have so far been difficult to process efficiently 
because of a lot of scalar assignments involved. 

One processor cluster(PC) hardware has already been completed. On 
the PC, various parallel processing application will be implemented. The 
PC involves sixteen processor elements, three common memories, a local 
control processor and three shared buses. Each PE consists of a 32-bit 
custom-made RISC- like processor with 64 general purpose registers »*ich 
executes all instructions including a few floating point operations in one 
clock (clock: 200ns), a 256-KW local data memory, a 2-W two-port memory to 
communicate with other PE's, two banks of 128-KW instruction memory and a 
DMA controller. The DMA controller realizes high-speed transfer of a block 
of data to the common memories and the two-port memories of other PE's and 
dynamic loading of a set of instruction codes from the common memories to 
one of the instruction memory banks during execution. The reduced 
instruction set and the one- clock- execution of the all instructions make 
the estimation of task processing time for the scheduling easy and 
accurate. For interprocessor conmunicat ion , three types of data transfer 
nodes are provided such as broadcast node, direct data transfer mode to 
the two-port memory of another PE or indirect data transfer mode via a 
common armory Each mode can be used for both single word data transfer 
and block data transfer. Each conmon memory accepts simultaneous accesses 
from three buses. The data transfer speed of the three buses totals to 
60MByte/s. 
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When we operate one PC of OSCAR, a Unix -based workstation is used as 
the host computer which generates Machine codes for each PE by using 
static schedule providing the aininal processing time and downloads the 
codes to each PE. In the generated codes, bus access timing by PE's, data 
transfer modes and use of 64 registers employed to exchange data among 
tasks assigned on the same PE are optimized. Jn addition, redundant task 
synchronization is also eliminated as mentioned before. An timing chart 
representing execution of the machine codes is shown Pig. 8. This chart can 
be regarded as a precise simulated result of actual parallel processing on 
OSCAR. In the figure, for PE3, characters such as "LD30" t "25R" . "wait H , " 
PE5” and so on are written These characters mean to load input data for 
task 30 from a local data memory to registers, execute task 30 and keep 
its result in registers, and wait for a while to directly transfer the 
output data of task 30 to PE5. At that time, PE3 waits for bus access 
since PEI is accessing bus for data broadcasting. In OSCAR, another PE 
cannot access the busses while a PE is broadcasting data. Furthermore "026 
51",” PE2" , "WAIT” , "FC44" and H 38R" represent to execute task 51 by using 
output data of task 26 on registers, transfer its output data to PE2, wait 
for output data of task 44 from PE5, check a flag showing completion of 
data transfer from task 44, execute task 38 and keep its output data on a 
register. 

Fig. 9 shows the measured parallel processing time on OSCAR (solid 
lines) and simulated parallel processing time (dotted lines and chained 
lines) of 51 tasks in Fig. 6. In this example, 4th-order Adams-Bashforth 
method was used. The measured processing time on OSCAR of the near fine 
granularity tasks was reduced from 108.7 us for one PE to 37.2 us (1/2.92) 
for seven PE's. Next , the task fusion technique which generates a coarser 
granularity task by combining several tasks in order to reduce data 
transfer overhead with the minimum loss of parallelism is evaluated. As a 
simple example, those tasks surrounded by dotted lines in Fig. 6 can be 
fused and twenty-two medium granularity tasks are generated automatically. 
Processing time of the medium granularity tasks (after task fusion) 
decreases from 105.8 us for one PE to 36.8 us (1/3.01) for seven PE's. 
From the results, it has been confirmed that the determination of the most 
suitable task granularity is very important and that the automatic task 
fusion is useful. 

The two dotted lines show the simulated processing time. It is clear 
from the figure that there exists little difference between the measured 
processing time and the simulated processing time or an execution image of 
machine codes generated by using static scheduling. In the light of this 
fact, we can conclude that the generat ion of the precisely optimized 
machine code using static scheduling is very useful for OSCAR. 

The processing time shown above, however, represents the degraded 
performance of OSCAR since OSCAR is still in a stage of operation testing. 
Though OSCAR can normally transfer two words data in 5 clocks, the 
processing time were measured in a degraded operating condition where two 
words data transfer takes 9 clocks. Therefore data transfer overhead will 
be reduced by half in the normal operating condition. The chained lines in 
Fig. 9 show the precisely simulated processing times in the normal 
condition for the near fine granularity before task fusion and the sedimi 
granularity after task fusion. The processing time after task fusion 
decreases from 104.8 us for one PE to 28.8 us for seven PE's (1/3.64). 
From the experiment mentioned above, it has been confirmed that OSCAR's 
architecture, especially one clock execution of all instructions and three 
types of data transfer modes, allows us to efficiently parallel process 
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the dynamic systems simulation by extracting the advantageous features of 
static scheduling to the naxiaui extent. 


V. CONCLUSIONS 

In this paper, the authors have proposed a parallel processing scheme 
of the dynamic systems simulation using static optimal multiprocessor 
scheduling algorithms and shown that the scheme allows us to realize 
efficient parallel processing on OSCAR which has been designed to extract 
the advantageous features of static scheduling to the maximum extent. 
More precisely speaking, the special purpose compiler for OSCAR using the 
proposed scheme can generate suitable granularity tasks, the minimal 
execution time schedule and optimized machine codes for each processor in 
which data transfer and synchronization overheads are minimized and the 
registers on each processor are used optimally. 

Furthermore, it has been confirmed that the architectural support in 
OSCAR for a parallelizing compiler using static scheduling is very 
useful. The authors are planning to develop a practical dynamic systems 
simulator using OSCAR which can simulate dynamics of flying objects like 
airplanes and missiles, nuclear reactors, robot systems and various 
industrial plants. 
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Fig-1 Block diagram for Van der Pol 
eq. 

begin 

a= in teg ra 1 (b, 0. 0 1 ) ; ill 
b= i n t e g f a 1 <c , 0. 0 l ) ; (21 

c = d -a ; (3> 

d=g~e; <d> 

e = f * g ; ( 5 i 

f = a*a ; (61 

g = b*l (71 

end. 

Fig. 2 Assignment statements for Van 
der Pol eq. . 
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^ Precedence 
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FS Flag set 

FC Flag check 

CD Unnecessary 


Fig. 4 Miniaizat ion of synchronization 
overhead . 


Fig. 3 Task graph for Van der Pol eq. 














Fig. 8 Execution image of machine 
codes on OSCAR. 
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Abstract. A general method for parallel and vector numerical solutions of stochastic 
dynamic programming problems is described for optimal control of general nonlinear, con- 
tinuous time, multibody dynamical systems, perturbed by Poisson as well as Gaussian ran- 
dom white noise. Possible applications include lumped flight dynamics models for uncertain 
environments, such as large scale and background random atmospheric fluctuations. The 
numerical formulation is highly suitable for a vector multiprocessor or vectorizing super- 
computer, and results exhibit high processor efficiency and numerical stability. Advanced 
computing techniques, data structures, and hardware help alleviate Bellman’s curse of di- 
mensionality in dynamic programming computations. 

1. Introduction. The primary motivation for this research is to provide a provide 
a general computational treatment of stochastic optimal control applications in continuous 
time. In addition, fast and efficient methods are being developed by the optimization of 
stochastic dynamic programming algorithms for larger multibody problems. The optimiza- 
tion will help alleviate Bellman’s curse of dimensionality, in that the computational problem 
greatly increases as the dimension of the state space increases. Optimization consists of par- 
allelization and vectorization techniques to enhance performance on advanced computers, 
such as parallel processors and vectorizing supercomputers. General Markov random noise 
in continuous time consists of two kinds, Gaussian and Poisson. Gaussian white noise, being 
continuous but nonsmooth, is used to model background random fluctuations, such as tur- 
bulence and external field variations. Poisson white noise (its frequency spectrum is also flat 
like Gaussian noise), being discontinuous, is useful for modeling large random fluctuations, 
such as shocks, colb si on s, unexpected external events and large environmental changes. Our 
general feedback control approach combines the treatment of both linear and nonbnear (i.e., 
singular and nonsingular) control through the use of small and non- small quadratic costs. 
The methods also handle deterministic and stochastic control in the same code, making it 
convenient for checking the effects of stochasticity on the application. Some actual apph- 
cations are models of resources in an uncertain environment [15], [11], [8]. Some potential 
applications are flight dynamics under random wind conditions [2] and other resource models 
[ 12 ]. 

The Markov, multibody dynamical system is illustrated in Figure 1 and is governed by 
the stochastic differential equation (SDE): 

dy(s) = F(y,s,u)ds + G(y,s)dW(s) + H(y,s)dP(s) , (1.1) 
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Feedback in time dt f 

Figure 1: The multibody dynamical system. 


y{t) = x ; 0 < t < s < t f \ y(s) e V y ; u G V u , 

where y{s) is the m x 1 multibody state vector at time s starting at time £, u — u(y,,s) is 
the n x 1 feedback control vector, W is the r-dimensional normalized Gaussian white noise 
vector, P is the independent g-dimensional Poisson white noise vector with jump rate vector 
[A;] gx i, F is the m x 1 deterministic nonlinearity vector, G is an m x r diffusion coefficient 
array, and H is an m x q Poisson amplitude coefficient array. 

The control criterion is the optimal expected cost performance, 

V(x,t) = min [MEAN F< w [V[y,i,u,P,W] | y(t) = x]] , (1.2) 

over some specified optimal control set T> U) where the total cost is 

F[y,*,u,P,W] = ds C(y(s),s,u(y(s),s)) , (1.3) 

on the time horizon ( t , t/). In (1.3), the instantaneous cost function C = C(x.,t, u) is 
assumed to be a quadratic function of the control, 

C(x,t,u) = C 0 (x,t) + Cf(x,t)u + |u T C' 2 (x,t)u . (1.4) 

The unit cost of the control increases with u when C 2 is positive definite. For example, the 
cost criterion could be minimal fuel consumption, minimum distance to target or minimum 
time to target. No final salvage value is assumed at final time, so V is zero at t — tf. 

In addition, the deterministic, nonlinear dynamics in (1-1) are assumed to be linear in 
the controls, 


F(x,f,u) = F 0 (x,t) + Fi(x, t)u , 


(1.5) 
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but nonlinear in the multibody state variable x. 

For numerical purposes, it is more convenient to convert equations (1.1 )-( 1.2) to an 

effectively deterministic partial differential equation using Bellman’s of optimality as illus- 
• ^ 
trated in the optimization step from optimal control vector U to optimal expected costs 
* 

V in Fig. 2. The Bellman functional PDE of stochastic dynamic programming. 



Figure 2: The optimization step from controls to costs. 


0 - v; + H[V] = v; + F*VV* + |GG T (x,f):VV T F* 

+ E V [ V(X + H,(x,t),t) - F*(x,f) ] (1.6) 

Z=1 

+ Co + (1U*-U h ) t C 2 U* , 

follows from the generalized ltd chain rule for Markov SDEs as in [7] and [15], where U* is 
the optimal feedback control computed by constraining the unconstrained or regular control, 

U*(x,0 = -C7 l (Ci + F/W), (1.7) 

to the control set T) u . In general, the Bellman equation (1.6) is nonlinear with discontinu- 
ous coefficients due to the last term, (|U — U H ) T C 2 U*, in (1.6) and due to the compact 
relationship between the constrained, optimal control and the unconstrained, regular control, 

* 

U t (x,f) = minfl/mBx.i, minff/min.o U R< i(x , £)]], (1.8) 

for i = 1 to n controls, where U mn is the minimum control constraint vector and U max 

j|j 

is the maximum. As the constraints are attained, the optimal control U , changes from 
the regular control, U H , to the single bang control values, U min or U max , which in general 
are functions of state and time. In (1.6), the symbol (:) denotes the scalar matrix product 
A : B — jyjLi AijBij , assuming B is symmetric. It is important to note that the 
principal equation, the Bellman equation (1.6), is an exact equation for the optimal expected 
value V* and does not involve any sampling approximations such as the use of random number 
generators in simulations. 
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Since there is no final salvage value and since (1.6) is a backward equation (unlike the 
usual diffusion equation, which is a forward equation), the final condition is that = 

0 using (1.2) and (1.3). On the other hand, boundary conditions for the PDE of stochastic 
dynamic programming (1.6) are not as simple or as straightforward to state. This is because 
the boundary conditions vary significantly with the form the deterministic linearity function 
F, the Gaussian noise W , and the Poisson noise P . Thus for treatment of general boundary 
conditions, it is most practical to directly integrate (1-6) for the special values of x y or to use 
the objective functional directly as defined in (1.2) and (1.3). The problem with boundary 
conditions is also present in stochastic application in continuous time, even when there is no 
control variable or optimization in the problem. 

As the number of multibody state variables, m, increases, the spatial dimension rises, 
and computational difficulties are present that can compare to those of three-dimensional 
fluid dynamics computations. This is the famous Bellman’s curse of dimensionality [3]. Thus 
there is a great need to make use of advanced-architecture computers, to use parallelization 
as well as vectorization. The Panel on Future Directions in Control Theory [6] stresses the 
importance of making gains in such areas as nonlinear control, stochastic control, optimal 
feedback control and computational methods for control. This paper is a preliminary report 
on our efforts to treat all of the above mentioned areas combined from the computational 
point of view. 


2* Numerical Methods. The integration of the Bellman equation (1.6) is backward 
in time, because V* is specified finally at the final time t = tf , rather than at the initial 
time. A summary of the discretization in state and backward time is given below: 


Xj = [x ijt .] mxl = [X a + U - i)-^u , 

[jijmxi , where j { = 1 to M; , for * = 1 to m ; 
T k = t f - (k - 1) • DT , for k = 1 to K ; 


( 2 . 1 ) 


V (Xj,r fc ) — L[V ](Xj,T i) 




where DX{ is the mesh size for state i and DT is the step size in backward time. 

The numerical algorithm is a modification of the predictor corrector, Crank Nicolson 
methods for nonlinear parabolic PDEs in [5]. Modifications are made for the switch term 
and delay term calculations. Derivatives are approximated with an accuracy that is second 
order in the local truncation error, at all interior and boundary points. The Poisson induced 
functional or delay term, V (x + H/ } t), changes the local attribute of the usual PDE to a 
global attribute, such that the value at a node [X + H/]j will in general not be a node. Linear 
interpolation, with special handing of point near the boundaries, maintains the numerical 
integrity compatible with the numerical accuracy of the derivative approximations. Even 
though the Bellman equation (1.6) is a single PDE, the process of solving it not only produces 

the optimal expected value V , but also the optimal expected control law U . This is 
because the PDE is a functional PDE, in which the computation of the regular control is fed 
back into the optimal value and the optimal value feeds back into regular control through 
its gradient. The nonstandard part of the algorithm is to decompose this tightly coupled 
analytical feedback so that both the value and the control can be calculated by successive 
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iterations, such that each successive approximation of one improves the next approximation 
of the other. While our procedure may look superficially like a standard application of finite 
differences, it is not due to the nonstandard features mentioned above. For these reasons, 
we are not aware of any other successful stochastic dynamic programming code that treats 
anywhere near the generality of applications that we treat. Variations of this algorithm have 
been successfully utilized in [15] and [ 8 ]. 

Prior to calculating the values, Vj ifc+1 , at the new (/c-f 1 )** time step for k = 1 to K - 1 , 
the old values, Vj ifc and Vj,jt-i, are assumed to be known, with v» = Vj x . The algorithm 
begins with an extrapolator (x) start: 


v {x) x 

j.fc +2 


1(3 -V. 






( 2 . 2 ) 


which are then used to compute updated values of the gradient of V , the second order 
derivatives, Poisson functional terms (V at (x + H)), regular controls U*, optimal controls 
U*, and finally the new value of the Bellman equation spatial functional Lj^+o.h- The 
extrapolation step greatly speeds up the convergence of the corrector step, except at the 
initial step. These evaluations are used in the extrapolated predictor (xp) step : 


t/( x P) w, i jyT . I r( x ) 

v s*+' - ‘ + DT -iVl • 

which are then used in the predictor evaluation (xpe) step: 


i/(xpe) _ i/x/xp) 

jjk+i ^ j’ fc+i 


+ 


Vi*), 


(2.3) 


(2.4) 


an approximation which preserves numerical accuracy and which is used to evaluate all terms 
comprising Lj k+os . The evaluated predictions are used in the corrector (xpec) step: 


T/ (x pec, 7 + 1) _ t, i nr r( x P e >7) 

V j,k+i ~ v i* + U1 L ik+ l 


(2.5) 


for 7 = 0 to 7 max while stopping criterion unmet, with corrector evaluation (xpece) step: 


y(xpece,7 + 1) 

j .^+2 


in/(xp ec 7 + 1) 

2 V K j ,fc+l 




( 2 . 6 ) 


The predicted value is taken as the zeroth correction. The stopping criterion for the correc- 
tions is a heuristically motivated comparison to a predictor corrector convergence criterion 
for a linearized, constant coefficient PDE [13]. The stopping criterion is computed with 
a robust mesh selection method, so that only a few corrections are necessary. The selec- 
tion of the mesh ratio, the ratio of the time step DT to the norm of the space or state 
step DX, guarantees that the corrections will converge whether the Bellman equation (1.6) 
is parabolic-like (with Gaussian noise) or hyperbolic-like (without Gaussian), according to 
whether or not an explicit second derivative is in the equation. 

Parallelization and vectorization of the algorithm was done by what might be called the 
“Machine Computational Model Method,” i.e., tuning the code to optimizable constructs 
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that are automatically recognized by the compiler, with the Alliant FX/8 vector multi- 
processor [1] in mind. All inner double loops were reordered to fit the Alliant concurrent 
outer - vector inner (COVI) model. All non-short single loops were made vector-concurrent 
Short loops became scalar-concurrent only. Multiple nested loops were reordered with the 
two largest loops innermost, A total of 37 out of 39 loops was optimized. Detailed results 
for a two-state and two-control model with Poisson noise are reported in [9]. Very similar 
techniques work for the vectorizing Cray supercomputers, except that only inner loops are 
vectorized. Vectorizing and parallelizing techniques are very similar, because vectorization 
is really a primitive kind of parallelization and because both are inhibited by many of the 
same types of data dependencies. 

The relative performance of column oriented versus row oriented code is discussed in 
[10]. Dongarra, Gustavson, and Karp [4] have demonstrated that loop reordering gives 
vector or supervector performance for standard linear algebra loops on a Cray 1 type column 
oriented FORTRAN environment with vector registers. However, for the stochastic dynamic 
programming application, the dominant loops are non-standard linear algebra loops, so that 
the preference for column oriented loops is not a rule, as demonstrated on the Alliant vector 
multiprocessor [10]. 

Current efforts are concentrated on implementing the code on the Cray X-MP /48 and 
Cray 2 for more general multi-state and multi-control applications. In order to implement 
the code for arbitrary state space dimension, a more flexible data structure is needed for the 
problem arrays, F, G and H, as well as for the solution arrays, V along with its derivatives 
and the control U. In the straight-forward, original data structure, an array like the non- 
linearity vector requires one index, js(is), to denote a numerical node for each state variable 
is: 


F(is,js(l),js(2), ■ • • ,js(m)) (2.7) 

for each state equation, is = 1 to m. It is assumed that there are a common number 
M — Mi = • • • = Mm of nodes per state, so that js(is) = 1 to M for is = 1 to m states. 
As a consequence, the typically dominant loops containing the nonlinearity function F, the 
solution gradient DV or similarly sized array are nested to a depth of at least m + 1. A 
typical loop has the form 

do 1 i = 1, m 

do 1 jl = 1, M 


do 1 jm = 1, M 

1 F(i,jl,j2, - • -,jm) 

This state size dependent loop nest depth level of m -F 1 inhibits the development of general 
multibody algorithms, especially when the state size m is incremented and the number of 
loops in each nest have to be changed. Also, vectorization is inhibited for compilers that 
vectorize only the most inner loop. Parallel and vector optimization is important, due to the 
size of the work load, which is 0(m ■ M m ), for the dominant loop illustrated above. As the 
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number of states grows the computational load will grow like some multiple of 

m ■ M m = m • 

i.e., the load grows exponentially in the number of states m. This exponential growth is 
merely a quantitative expression of Bellman’s curse of dimensionality. 

One way around this inhibiting structure (2.7) is to use a vector data structure: 

FV(isJv) (2.8) 

for the nonlinearity vector as an example, such that all the numerical nodes are collected 
into a single vector indexed by the global state index jv, where jv = 1 to M m over all state 
nodes. Assuming that the number of nodes per state are fixed at M , then for a fixed set of 
state node indices {js(1),js(2), ■ • • ,js(m)}, the global state vector index is computed from 
the direct mapping formula 


> = ]£(j*(0 - 1) • M l 1 + 1, (2.9) 

i=l 

in the case of fixed state mesh size, Mi = M for all states i. 

Both the direct mapping from the original data structure to the vector data structure 

and the inverse mapping are needed to compute the amplitude functions, F, G and H, as 

* 

well as the derivatives of V , because these quantities depend on the original formulation. 
The pseudo-inverse of the vector index in (2.9) can be shown to permit the recovery of the 
individual state indices by way of integer arithmetic: 

m 

js(is\jv) = 1 + [jv - 1 - Y, ~ 1) * AT*' -1 ]/#”- 1 , (2.10) 

i=i§+l 

recursively, for is = m to 1, by back substitution, with a* = 0. The vector data 

structure of (2.8) to (2.10) results in major do loop nests of the order of 1 to 2, rather than 
order of m + 1. A typical vector data structure loop has the form 

do 2 i = 1, m ! parallel loop. 

do 2 jv = 1, M * *m ! vector loop. 

2 FV(ijv) 

resulting in a reduction of the loop nest depth from m + 1 to 2, independent of the number 
of states m. Preliminary implementation of the vector data structure is available on the 
Alliant multiprocessor and on the Cray X-MP/48. 

One major disadvantage of the vector data structure given in (2.10) is that the largest 
degree of parallelism available to a parallel processor or multiprocessor in the most outer 
or state number loop is m, the number of states. This task load can be better scheduled 
on parallel processors by block decomposition or strip mining of the vector data structure 
loop in the index iv, so that the single inner loop is split into two evenly balanced loops (cf., 
Polychronopoulos [14]). Thus, dividing the vector data structure into blocks can enhance 
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parallelism. Let MBLK be the number of state nodes in each block and then the total 
number of blocks will be 


NBLK = M m / MBLK, 

assumed to be an integer for simplicity. Consequently, the blocked version of the typically 
dominant loop will have the form 

do 3 i = 1, m 

do 3 jblk = 1, MBLK ! parallel loop, 
jvl = 1 + MBLK*(jblk - 1) 
jv2 = MBLK*jblk 
do 3 jv = jvl, jv2 ! vector loop. 

3 FV(iJv) = 

This form should result in better parallel optimization when there are more than m available 
parallel processors. 

The advantages of the algorithm is that it 1) permits the treatment of general continuous 
time Markov noise or deterministic problems without noise in the same code, 2) maintains 
feedback control, 3) permits the cheap control limit to linear singular control to be found 
from the same quadratic cost code, 4) stable mesh selection can be used to control the 
number of corrector steps, and 5) produces very vectorizable and parallelizable code whose 
performance is described in the next section. 

3. Results and Discussion. The stochastic dynamic programming code arose from 
renewable resource modeling problems of Hanson and co-worker Ryan, with various one-state, 
one-control models treated in [15] and [11]. Two-state, two-control models were treated 
by Hanson [8]. In the two-state model [8], the two controls represent removals from the 
system by respective commercial and recreational users of the system. Poisson noise is 
used to represent natural catastrophic events. Applications to aerospace problems only 
entails modification of the dynamical system and performance criteria input by appropriate 
aerospace input functions and parameters. 

The dynamic programming code has been optimized for parallelization and vectoriza- 
tion [9] using Hanson’s two-state model [8] as a test example, and the Alliant FX/8 vector 
multiprocessor as the advanced hardware. The Alliant FX/8 at the Advanced Computing 
Research Facility (ACRF) at Argonne National Laboratory was used for benchmarking the 
code. This Alliant FX/8 has eight vector computing elements (CEs). Each of the CEs has 
eight vector registers whose length is 32 eight-byte elements, and the CEs are connected 
to a 128 KB cache. Some automatic parallelization and vectorization is performed, but 
significant increases are still attainable by the removal of optimization hindering data de- 
pendencies. Benchmark performance was measured for many mesh sizes and on all processor 
configurations. Almost all loops were of the highly optimized parallel and vector type for the 
Alliant. Over 65% efficiency was achieved over a wide range of tests [9]. The temporal mesh 
was chosen to be about four times more refined than the spatial mesh, K = 4 • (M — 1) + 1, 
for a fixed number of spatial nodes M and for constant numerical stability conditions. In 
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addition, vector stride effects (resonance effects related to multiples of the vector register 
length of 32 on the FX/8) were found with non-standard performance in both column and 
row referencing environments [10]. 

The present results have been obtained for a three-state, three-control modification 
of Hanson’s two-state resource model [8] and by implementing the vector data structure 
mentioned above. The present application contains a new interacting state with competition. 
The present code is in a form where it is much more convenient to change the application, 
the advanced computer intrinsics, and the number of states. 

Table 1 compares the performance of the code on the ACRF Alliant FX/8 vector multi- 
processor at Argonne National Laboratory, the NCSA Cray X-MP/48 vector supercomputer 
at Urbana, and the University of Illinois at Chicago IBM3081K as a scalar uniprocessor 
reference. The Cray X-MP/48 is a four processor pipelined vector multiprocessor, but the 
use of the X-MP is much more costly to use in parallel than the Alliant and so only single 
processor results are reported here for the X-MP. The Cray executing on one vector processor 
outperforms the Alliant using either one vector processor or the full eight vector processors, 
due to the more powerful pipelined processing unit on the Cray. The advantages of block de- 
composition with MBLK = 32 for eight Alliant processors are illustrated in the table, where 
the eight processor time has been reduced from about 52 to 33 seconds when M ~ 16, while 
the one processor time has increased dramatically for the block method. The IBM3081K 
scalar uniprocessor is much slower when M — 8 unblocked spatial nodes than any of the 
Alliant or Cray values at M — 8. However, as the spatial mesh size is refined to M — 16 
spatial points, with a corresponding increase in work load, the IBM3081K performs between 
the one and eight processor Alliant, but still significantly below the CRAY performance. 


Table 1: Comparative Performance of IBM 3081K, Alliant and Cray, 

for three state model. 


Nodes 

Method 

IBM 3081K 

Alliant FX/8 

Cray X-MP 

state 

time 


vs fortran, opt (3) 

fortran -0 

eft 7 7 

M 

K 


V = 1 

p = i 

P = 8 

p = 1 

8 


unblocked 

38.513 

8.653 

2.980 

0.144 



unblocked 

85.377 

147.391 

51.619 

2.058 

8 

29 

blocked 

— 

13.693 

1.998 

— 

16 


blocked 

— 

223.426 

32.729 

— 


The performance of the stochastic programming code under parallel and vector oper- 
ation is investigated in more detail on the ACRF Alliant FX/8, which has better parallel 
capability than the Cray X-MP/48. The Cray X-MP/48 is also a vector multiprocessor, but 
the multiprocessing features are not as easily accessed as on the Alliant, where paralleliza- 
tion is more transparent. In Figure 3, the blocked and unblocked code is compared on the 
Alliant FX/8 with time T(p) plotted against the number of processors p. The unblocked 
code runs faster as the number of processors increases from one, but then ceases to run any 
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faster beyond p = 3 processors due to the fact that the maximum parallelism available is 
the three iterations in the three-state outer loop. The blocked code, using a block size of 
MBLK = 32 (the vector register length on the Alliant) runs faster the more processors used 
out of the eight vector processors. However, the unblocked code is faster for p < 5, but 
slower for p > 5. The trade-off point between the blocked and unblocked code is p = 5, with 
the block overhead slowing down the code for p < 5, but the benefit of parallelism is found 
for p > 5. 

Figure 4 shows the speedup, S(p) = T(l)/T(p), versus the number of processors p. 
The unblocked code clearly exhibits a speedup plateau for p > 3 and the unblocked code 
exhibits nearly ideal speedup, S(p) — p for all p. However, this figure illustrates the danger 
of comparing speedups, because the unblocked case is better for p < 5 as demonstrated in 
Figure 3. In Figure 5, the efficiency, E(p) = S(p)/p or speedup per processor, versus the 
number of processors p is shown. Again, the blocked efficiency is much higher than the 
unblocked efficiency, independent of the actual performance. 

4. CONCLUSIONS. Stochastic dynamic programming algorithm can be optimized to 
permit numerical solution of larger state space problems using vector multiprocessors. In 
order to handle a large number of state variables, a large number of parallel processors 
would be desirable, but Bellman’s curse of dimensionality appears to very much weakened. 
Parallelization, vectorization, and general supercomputing are important in the solution 
of the larger problems. Robust mesh selection techniques are necessary to achieve stable 
algorithms. These techniques are generally applicable to other vector and parallel computers. 
The general code is valid for general Markov noise in continuous time, feedback control, 
nonlinear dynamics, nonlinear control and the cheap control limit. 

Future directions include applications to aerospace problems, improved development of 
general code for an arbitrary number of state variables, enhanced code portability, exten- 
sions to Kalman filtering for imperfect observations, and optimization for other advanced 
architectures. 
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Figure 4: Speedups for blocked (blk) and unblocked (unblk) versions of the code. 
Speedup is denoted by S(p) — T(l)/T(p) and p is the number of processors. The notation 
(ideal) denotes the ideal case, S(p) — p . Results are for m — 3 states, M = 16 spatial nodes 
and K = 61 temporal nodes. 
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Figure 5: Efficiency for blocked (blk) and unblocked (unblk) versions of the code. 
Efficiency is denoted by E{p) = S(p)/p and p is the number of processors. The notation 
(ideal) denotes the ideal case, E(p) = 1. Results are for m = 3 states, M = 16 spatial nodes 
and K = 61 temporal nodes. 
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Multiprocessor Machine 
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Abstract 

A parallel processing scheme for a single chain robot arm is presented 
for high speed computation on a shared memory multiprocessor. A 
recursive formulation that is derived from a virtual work form of the 
d'Alembert equations of motion is utilized for robot arm dynamics. A joint 
drive system that consists of a motor rotor and gears is included in the arm 
dynamics model, in order to take into account gyroscopic effects due to the 
spinning of the rotor. The fine grain parallelism of mechanical and control 
subsystem models is exploited, based on independent computation 
associated with bodies, joint drive systems, and controllers. Efficiency and 
effectiveness of the parallel scheme are demonstrated through simulations 
of a telerobotic manipulator arm. Two different mechanical subsystem 
models, i.e., with and without gyroscopic effects, are compared, to show the 
trade-off between efficiency and accuracy. 
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A Unifying Framework for Rigid Multibody Dynamics and 
Serial and Parallel Computational Issues 

Amir Fijany and Abhinandan Jain 
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Abstract 

In this paper we present a unifying framework for various formulations 
of the dynamics of open-chain rigid multibody systems and assess their 
suitability for serial and parallel processing. The framework is based on 
the derivation of intrinsic, i.e., coordinate-free, equations of the algorithms 
which provides a suitable abstraction and permits a distinction to be made 
between the computational redundancy in the intrinsic and extrinsic 
equations. A set of spatial notation is used which allows the derivation of 
the various algorithms in a common setting and thus clarifies the 
relationships among them. The three classes of algorithms viz., Oin), Oin 2 ) 
and Oin 5 ) or the solution of the dynamics problem are investigated. We 
begin with the derivation of Oin 5 ) algorithms based on the explicit 
computation of the mass matrix and it provides insight into the underlying 
basis of the Oin ) algorithms. From a computational perspective, the optimal 
choice of a coordinate frame for the projection of the intrinsic equations is 
discussed and the serial computational complexity of the different 
algorithms is evaluated. The three classes of algorithms are also analyzed 
for suitability for parallel processing. It is shown that the problem belongs 
to the class of N C and the time and processor bounds are of O(log|<n)) and 
Oin 4 ), respectively. However, the algorithm that achieves the above bounds 
is not stable. We show that the fastest stable parallel algorithm achieves a 
computational complexity of Oin) with Oin 4 ) , respectively. However, the 
algorithm that achieves the above bounds is not stable. We show that the 
fastest stable parallel algorithm achieves a computational complexity of Oin) 
with Oin 2 ) processors, and results from the parallelization of the Oin 5 ) serial 
algorithm. 
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Abstract- In this paper parallel computation of manipulator forward dynamics 
is investigated. Considering three classes of algorithms for the solution of 

the problem, that is, the 0(n), the 0(n ), and the 0(n ) algorithms, 
parallelism in the problem is analyzed. It is shown that the problem belongs 

to the class of NC and that the time and processors bounds are of O(log,n) and 

0(n 4 ), respectively. However, the fastest stable parallel algorithms achieve 
the computation time of 0(n) and can be derived by parallelization of the 

0(n 3 ) serial algorithms. Parallel computation of the 0(n ) algorithms requires 
the development of parallel algorithms for a set of fundamentally different 
problems, that is, the Newton-Euler formulation, the computation of the 
inertia matrix, decomposition of the symmetric, positive definite matrix, and 
the solution of triangular systems. Parallel algorithms for this set of 
problems are developed which can be efficiently implemented on a unique 
architecture, a triangular array of n(n+l)/2 processors with a simple 
nearest-neighbor interconnection. This architecture is particularly suitable 
for VLSI and WSI implementations. The developed parallel algorithm, compared 
to the best serial 0(n) algorithm, achieves an asymptotic speedup of more than 
two orders-of-magnitude in the computation the forward dynamics. 


I. INTRODUCTION 

The manipulator forward dynamics problem concerns the determination of 
the motion of the mechanical system resulting from the application of a set of 
joint forces/torques which is essential for dynamic simulation. The motivation 
for devising fast algorithms for forward dynamics computation stems from 
applications which require extensive off-line simulation as well as 
applications which require real-time dynamic simulation capability. In 
particular, for many anticipated space teleoperation applications, a faster- 
than-real-time simulation capability will be essential. In fact, in the 
presence of unavoidable delay in information transfer, such a capability would 
allow a human operator to preview a number of scenarios before run-time (ll. 

The forward dynamics problem can be stated as follows: Given the vectors 
of actual joint positions (Q) and velocities (Q), the external force (f £ ) and 

moment (n ) exerted on the End-Effector (EE), and the vector of applied joint 
E 
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forces/torques (x), find the vector of Joint accelerations (Q). Integrating 
the vector of joint accelerations leads to the new values for Q and Q, and the 
process is then repeated for the next x. The first step in computing the 
forward dynamics is to derive a linear relation (for the given manipulator 
configuration described by the vector of joint positions) between the vector 
of Joint accelerations and the vector of applied inertial forces/torques. 

Given the dynamic equations of motion as 
A(Q)Q + C(Q, Q) + G(Q) + /((DF = T (1) 

and defining the bias vector b as 

b = C(Q.Q) + G(Q) + /(Q)F e (2) 

the linear relation Is derived: 

A(Q)*Q = x-b = T (3) 

• • • 

where Q, Q, and Q are nxl vectors and F , a 6x1 vector, is a combined 

representation of f £ and n^.. A (Q ) is an nxn symmetric, positive definite, 

inertia matrix, and J is the 6xn Jacobian matrix (t denotes matrix transpose). 
The bias vector b represents the contribution due to coriolis and centrifugal 
terms C(Q,Q), gravitlonal terms G(Q), and the external force and moment. 

Hence, in Eq. (3), T is the nxl vector of applied inertia forces/torques. The 
bias vector b can be obtained by solving the inverse dynamics problem, using 
the Newton-Euler (N-E) formulation [2], while setting the vector of joint 
accelerations to zero. The computation of the vectors b and T represent the 
common first step in any algorithm for solving the forward dynamics problem. 

The proposed algorithms for the forward dynamics problem differ in their 
approaches to solving Eq. (3), which directly affects their asymptotic 
computation complexity. These algorithms can be classified as: 

1) The 0(n) algorithms [3]-[6] which, by taking a more explicit advantage of 
the structure of problem, e.g., by using the Articulated-Body Inertia [ 3] — [ 4 ] 
and recursive factorization and inversion of the inertia matrix [ 5 ] - [ 6 ] , solve 
Eq. (3) in 0(n) steps without explicit computation and inversion of the 
inertia matrix. 

2) The 0(n ) conjugate gradient algorithms [7, 10] which iteratively solve 
Eq. (3) without explicit computation and inversion of the inertia matrix. The 
conjugate gradient algorithm is guaranteed to converge to the solution in at 
most n iterations which, given the 0(n) computational complexity of each 

2 

iteration, leads to an overall 0(n ) computational complexity. 

3 

3) The 0(n ) algorithms [7] which solve Eq. (3) by explicit computation and 
inversion of the inertia matrix, leading to an 0(n ) computational complexity. 

However, any analysis of the relative efficiency of these algorithms 
should be based on the realistic size of the problem, i.e., the number of 
Degree-Of -Freedom (DOF), rather than the asymptotic complexity. In fact, the 

comparative study in [3] -[4] shows that the 0(n 3 ) Composite Rigid-Body 
algorithm is the most efficient for n less than 12. It should be pointed out 
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that efficiency of the 0(n 3 ) and 0(n 2 ) algorithms has been recently Improved 

3 

19)— [10] . However, despite these Improvements, even the fastest 0(n ) 
algorithm is far from providing the efficiency required for real-time or 
faster-than-real-tlme simulation. This observation clearly suggests that the 
exploitation of a high degree of parallelism in the computation is the key 
factor in achieving the required efficiency. 

The analysis of the efficiency of the different algorithms for parallel 
computation is more complex than that for serial computation. In the next 
section, the three classes of algorithms are analyzed based on their 

3 

efficiency for parallel computation and it is shown that the 0(n ) algorithms 
are also the most efficient for parallel computation. However, parallelization 

of the 0(n 3 ) algorithms represents a challenging problem since it requires the 
development of parallel algorithms for computation of a set of fundamentally 
different problems, i.e., the N-E formulation, the inertia matrix, the 
factorization of the inertia matrix, and the solution of triangular systems. 

Lee and Chang [15] were first to investigate the computation of the 

forward dynamics by parallelization of the 0(n 3 ) algorithms. Considering an 
SIMD architecture with n processors interconnected through a generalized-cube 
network, a modified version of their 0(log 2 n) algorithm in [16] and an 0(n) 

parallel version of the Composite Rigid-Body algorithm were developed for 

2 

computation of the N-E formulation and the inertia matrix. A parallel 0(n ) 
Cholesky algorithm and the 0(n) Column-Sweep algorithms were also proposed for 
the factorization of the inertia matrix and the solution of the resulting 

triangular systems, leading to an 0(n 2 ) complexity of the overall computation. 
However, the main drawbacks of the proposed algorithms reside in the 

2 

complexity of the required interconnection network and the 0(n ) communication 
complexity which mainly results from the excessive data alignment needed for 
different algorithms. 

In this paper, we present a set of efficient parallel algorithms for the 

computation of the forward dynamics, using the 0(n ) algorithms, which can be 
implemented on a two-dimensional array of n(n+l)/2 processors with a nearest 
neighbor interconnection. The overall of communication complexity, even with 
such simple interconnection structure, is limited to 0(n) and no additional 
data alignment between the computation of the different algorithms is 
required, which further reduces the overhead in the parallel computation. 

A new algorithm for computation of the inertia matrix is developed which, 
though not efficient for serial processing, achieves the best performance for 
parallel computing in terms of both computation and communication complexity 
while demanding simple architectural features for its implementation. The 
parallel algorithm for computing the inertia matrix achieves the time lower 
bound of 0(log 2 n)+0(l) on the processor array. Synchronous data-flow parallel 

algorithms are also developed for factorization of the inertia matrix and the 
solution of the resulting triangular systems on the processor array. 
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This paper is organized as follows. In Section II, parallelism and time and 
processors bounds in the computation of the forward dynamics are investigated. 

In Section III, parallel algorithm for computation of the inertia matrix is 
developed. In Section IV, parallel computation of the bias vector and the 
linear system solution are briefly discussed. Finally, some concluding remarks 
are made in Section V. 

II. PARALLELISM IN FORWARD DYNAMICS COMPUTATION 

A. Time and Processor Bounds in the Computation 

The analysis of time and processors bounds in parallel computation of a 
given problem is of fundamental theoretical importance. It can determine the 
inherent parallelism in the problem and the bound on the number of processors 
required for exploiting maximum parallelism and achieving the time lower 
bound in the computation. However, besides the theoretical importance, it can 
also provide, as is the case for forward dynamics problems, useful insights 
into devising more practical and efficient parallel algorithms (in the sense 
of both computation time and number of processors) for the problem. 

Let P denote the class of problems that can be solved sequentially in a 
time bounded by a polynomial of the input size, n. Also, let NC (for Nick s 
Class" [18]) stand for the class of problems that can be solved in parallel 

in a time of O(log k n), for some constant k, with a number of processors 
2 

bounded by a polynomial of n. One open question regarding the complexity of 
parallel algorithms is whether P = NC, which is thought to be very unlikely 
[19]. It is clear that NC £ P. For k = 1, the time of 0( log 2 n)+0( 1 ) represents 

the natural time lower bound in the computation. However, most of the 
kinematic and dynamic problems in robotics belong to the class of NC [8], 
Furthermore, it is possible to devise parallel algorithms which achieve the 
time lower bound of O(logn)+O(l) in solving these problems [8,14,16,17]. In 

the following, we study the time and processors bounds in the computation of 
the forward dynamics by different algorithms. 

Using the N-E formulation, the bias vector can be computed in a time of 

O(log n)+0(l) with 0(n) processors 115]— [16] . This implies that the time and 
2 

processors bounds in the forward dynamics computation are determined by those 
in the solution of Eq. (3). Note that, with 0(n) processors, the integration 
of the computed joint accelerations can be performed in a time of 0(1). 

The solution of Eq. (3) by the 0(n) algorithms results in a set of first- 
order nonlinear recurrences which can be represented (at an abstarct level) as 

X, = c , * V x ... ,/ V x m ) - c . * * (x ,..> (4) 

where C is constant, <p and <p^ are polynomials of first and second degree, 

and deg <f> = max (deg <p , deg (pj =2. It is well-known that, regardless of the 

number of processors, the computation of nonlinear recurrences of the form of 
Eq. (4) and with deg <p>l can be speeded up only by a constant factor 
[20 ] - [ 21 ] . This is due to the fact that the data dependency in nonlinear 
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recurrences and, particularly, those containing division, Is stronger than in 
linear recurrences [22]. Hence, the parallelism In the 0(n) algorithms is 
bounded, that Is, their parallelization leads to the 0(n) algorithms which are 
faster than the serial algorithm only by a constant factor. Note that a rather 
simple model was used for presentation of the nonlinear recurrences of the 
0(n) algorithms while they are far more complex than those usually studied 
in literature, e.g., in [21]— [22 ] (see [8] for a more detailed discussion). 

For the conjugate gradient algorithms in [7], [10], the computation of 
each iteration, as is shown in [15], can be done in a time of 0(log n) with n 

processors, leading to the 0(nlog 2 n) parallel algorithms. This implies that 

the parallelism in conjugate gradient algorithm Is unbounded. Asymptotically, 
however, the parallel conjugate gradient algorithms are slower than the best 
serial algorithms, the 0(n) algorithms, for the solution of the problem. 

The inertia matrix can be computed in 0(log n)+0(l) steps with 0(n 2 ) 
processors [8], [11], [13], The implication of this result is that it further 

reduces the analysis of the time and processors bounds in the forward dynamics 
problem to that in a more generic problem, the linear system solution. Csanky 

has shown that the linear system can be solved in 0(log 2 n) steps with 0(n 4 ) 

processors [23] . This implies that the forward dynamics problem belongs to 
the class of NC. Note that, using Cramer’s rule, the linear system solution 
can be computed in O(log n) steps with 0(n! ) processors [20], But such a 
result has neither theoretical nor practical importance. 

However, Csanky’ s algorithm is unpractical since, besides using too many 
processors, it is numerically unstable [25], The best stable algorithms for 

linear system solution achieve a time of 0(n) with 0(n 2 ) processors [24] -[25], 

Hence, parallelization of the 0(n 3 ) algorithms results in the stable 0(n) 

parallel algorithms with 0(n 2 ) processors, which indicates an unbounded 
parallelism. 

The above analysis shows that the forward dynamics problem belongs to the 
class of NC and that the best known upper bounds on the time and processors 

are 0(log 2 n) and 0(n 4 ), respectively. Practically, however, the fastest (and 

stable) parallel algorithm for its computation is of 0(n). With respect to 
these results the main question is, given the fact that both the serial 0(n) 

3 

and 0(n ) algorithms result in the 0(n) parallel algorithms, which one is more 
efficient for parallelization? 

Let denote the polynomial complexity of the serial 0(n) algorithms. 

There is a limited parallelism in both coarse grain and fine grain (in 
matrix-vector operation) forms in these algorithms [8], Exploitation of this 
parallelism leads to the parallel algorithms with polynomial complexity as 
due to the limited parallelism, oc^ is reduced to a. only by a 

small factor. Furthermore, exploitation of both coarse and fine grain 

parallelism requires additional architectural complexity. For the 0(n 3 ) 
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algorithms, the polynomial complexity of the resulting parallel 0(n) algorithm 
is of the form o. n+ 3 r^["log^n"| where oc_^ is smaller than <x^ by more than two 

orders-of-magnitude. As a result, while the algorithm is asymptotically faster 
than the serial 0(n) algorithms and their parallel versions by a high constant 
factor, it is also more efficient for small n. The price to be paid for this 

efficiency, of course, is an architecture with 0(n 2 ) processors. However, the 
efficiency of the parallel algorithm and the suitability of the architecture 

3 

for VLSI and WSI Implementation strongly support the choice of 0(n ) 
algorithms for parallel computing. 

III. PARALLEL COMPUTATION OF INERTIA MATRIX 

A. Basic Algorithms for computation of inertia matrix 

From Eq. (3) the elements of the inertia matrix can be computed as 


for the condition given by 

Q = 1 and Q =0 For k =1, 2 n (6) 

I k*l 

Two physical interpretations can be thought for the above condition, with each 
interpretation leading to a distinct class of algorithms as 

1) The first i-1 links do not have any motion, that is, they are static, and 
the accelerations and the forces/torques of the last n-i+1 links result from 
the unit acceleration of link i. This interpretation leads to the first class 
of algorithms, designated as the class of Newton-Euler Based (NEB) algorithms, 
in which the diagonal and lower off-diagonal elements of the inertia matrix 
are computed. In [7] an algorithm of this class is presented, designated as 
the Original NEB (ONEB) algorithm, which computes the inertia matrix by 
successive applications of the N-E formulation. 

2) The last n-i+1 links can be considered as a single composite rigid 
system, since they do not have any relative motion, which is accelerating in 
space, leading to the exertion of forces and moments on the first i-1 static 
links. This interpretation leads to the second class of algorithms, designated 
as the class of Composite-Rigid Body (CRB) algorithms, in which the diagonal 
and upper off-diagonal elements of the inertia matrix are computed. In [7) an 
algorithm of this class, designated as the Original CRB (OCRB) algorithm, is 
presented in which the center of mass and the first and the second moment of 
mass with respect to the center of mass of a set of composite systems are 
computed. 

The comparative study in [7] shows that the OCRB algorithm achieves a 
significantly greater efficiency over the ONEB algorithm. In [8]-[9], we have 
developed an algorithm, designated as the Variant of CRB (VCRB) algorithm, 
which avoids the redundancies in the OCRB algorithm and represents the most 
efficient algorithm (to date) for computing the inertia matrix. Note that, 
however, due to the symmetry of the problem, both interpretations and hence 
both classes of algorithms should lead to the same results and computational 
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Fig. 1. Comparison of different algorithms for computation of inertia matrix 


efficiency. In [8] -[9], we have shown that, by introducing or reducing the 
redundancy in the computation, the algorithms of the two classes can be 
transformed to one another and, particularly, to the most efficient one, the 
VCRB algorithm. Figure 1 shows the relative serial efficiency of and 
redundancy in different algorithms. 

Although the results presented in Fig. 1 answer the question of the 
serial efficiency of different algorithms, it does not indicate which 
algorithm provides the most suitable features for parallelization. For serial 
processing, removing any redundancy increases the computational efficiency. 

For parallel processing, however, depending on its impact on the data 
dependency in the computation, this may increase or decrease the efficiency. 
The fact that arbitrary algorithms can be developed by introducing or removing 
different types of redundancy In the computation represents an additional 
degree-of -freedom that can be exploited to derive an algorithm which, though 
perhaps not efficient for serial computing, is the most suitable for 
parallelization. In [8], [12], we have shown that the NEB algorithms are more 
suitable for parallel computing than the CRB algorithms. In fact, they not 
only achieve a better computational complexity (in the parallel sense) but 
also require a less complex communication and synchronization mechanism. This 
better efficiency for parallelization mainly results from the fact that the 
evaluation of the columns of the inertia matrix by the NEB algorithms is order 
independent and hence can be done in parallel. 


B. A Variant of Newton-Euler Based (VNEB) Algorithm 

Four different types of redundancy can be recognized in the ONEB algorithm, 
which can be eliminated, respectively, by [8], [9], [13]: 

1) Choosing a more suitable coordinate frame for projection of the equations. 

2) Optimizing the N-E formulation for the condition given by Eq. (6). 

3) Using a more efficient variant of the optimized N-E formulation. 

4) Introducing a two-dimensional recursion in the computation. 

Note that the first redundancy resides in the extrinsic equations and 
results from the choice of coordinate frame for projection of the intrinsic 
equations while the second, the third, and the fourth redundancies reside in 
the intrinsic equations and are inherent in the formulation. As stated before, 
by removing all redundancies in the intrinsic equations, the ONEB algorithm 
can be transformed to the VCRB algorithm. However, removing any type of 
redundancy in the NEB algorithms, as far as the order independence property 
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is preserved, will also Increase the efficiency of their parallel versions. 

In this regard, only the removal of the fourth redundancy, which leads to the 
Introduction of a two-dimensional recursion In the computation, results In the 
loss of the order independence property of the algorithm. In the following, a 
Variant of the NEB algorithm, designated as the VNEB algorithm. Is presented 
which Is developed by removing the first three redundancies. 

The derivation of the Variant of N-E Based (VNEB) algorithm Is fully 
discussed In [8], [9], [12], [13). Here, for the sake of completeness, a brief 
description of the algorithm is given. The algorithm Is presented by the 
intrinsic equations. In this paper, according to the Gibbs notation, vectors 
are underlined once and tensors (tensors of order 2) twice. Also, In order to 
simplify and, particularly, unify the derivation of the serial and parallel 


algorithms, a set of notations, given In Table I and Fig. 3, are used. The 
VNEB algorithm is then written as 

For 1=1, 2, .... n 

For j = 1, 1+1 n 

u(J, 1) = Z(i) (7) 

V(J.i) = V(J-l,i) + u(j.i)xP(J.J-l) = w(J, i)xP(J, 1) (8) 

F(j+l.j,l) = M(J)V(J,i) + w( J, i )xH( j) (9) 

N(j+1, j.i) = K(J)a(J.D (10) 

For J = n, n-1 1 

F(n+1 , j, 1 ) = F( J+1, j, i ) + F(n+1, j + 1, 1 ) (11) 

N(n+1, J, 1) = N( J+1, j. 1 ) + N(n+1, J+1,1) + P(J+1, J)xF(n+l, j+1, 1 ) (12) 
with F(n+1, n+1, 1) = N(n+l,n+l,i) = 0 

a j( = Z( j). N(n+1, j+1, i ) (13) 


C. Parallel Algorithm for Computation of Inertia Matrix 

The serial computational complexity of evaluating the Inertia matrix Is 
2 

of 0(n ). No serial algorithm can achieve a better asymptotic complexity 

2 

since, given n inputs (joint positions), the evaluation of the 0(n ) outputs, 

the elements of the inertia matrix, requires 0(n ) distinct steps in the 
computation. Based on the VCRB algorithm, we have already shown that the 

inertia matrix can be computed in 0( log^nJ+Ot 1 ) steps with 0(n ) processors 

[8], [11]. It is interesting to note that not only the same bounds on time and 
processors can be much more easily derived by parallelization of the VNEB 
algorithm but also the resulting parallel algorithm, compared to the parallel 
VCRB algorithm, reduces the coefficients on the polynomial complexity. 

For the implementation of the parallel algorithm achieving the time lower 
bound, we consider a two-dimensional array of n(n+l)/2 processor-memory 
modules represented as PR^ , for i = 1, 2, n and j = i, i + 1 n 
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(Fig. 3 shows the array for n = 6). For the parallel algorithm, the equations 
are projected onto the EE coordinate frame, coordinate frame n+1. An n DOF all 
revolute joints manipulator is considered for which the joint variables are 
the joint angles, that is, = 0^. It is assumed that the joint variables 0^ 

(or S0^ and C0^) and constant parameters Sa^, Ca^, ^ +1 P(j+l,j), J+1 H(j), 

J+1 K(j), and M(j) reside in the memory of all processors of Row j. The 
analysis of the mapping the algorithm onto the architecture of Fig. 3 is fully 
presented in [12]. Here, due to the lack of space, we only give the a brief 
description of the algorithm in terms of its computational steps and cost. 


For the parallel algorithm, the ith column of the inertia matrix is 
computed by the processors of the ith column of the processor array. The fact 

that <j(j,l) = Z(i) for j = i, i+1 n. Implies that global communication 

of Z(i) among the processors of the ith column is required. This requirement 
can be avoided by introducing two recurrences as 

w(j, i) = w(J-l, i) = Z ( i ) (14) 

P(J,1) = P(j-l.i) + P(j. j-1) (15) 

Equation (14) does not need any computation while, for the parallel algorithm, 
the computation of Eq. (15) is required. By computing Eqs. (14)— (15) as a set 
of coupled recurrences, the terms w(j-l,i) can be considered as the data 
associated with Eq. (15). Using such a scheme increases the communication 
complexity of parallel evaluation of Eq. (15) but will result in the global 
distribution of Z(i) among the processors of the 1th column. The computation 
of the parallel algorithm is then performed as follows. 

Step 1: 

1) Parallel compute R(j+1, j) by all processors of the jth row. 


For j = 1, 2 .... , n 


For 1 = 1, 2 j 

PRjj : R(j+l.j) 

2) Parallel compute R(n+l,J) by processors of the ith column. 

For 1 = 1, 2 n 

For j = i, i+1, .... n 

For T) = 1 step 1 until flog 2 (n+l-j)] , Do 
R(j+2 7) , j) = R(n+1 , j ) 

R( j+2 71 , j ) = R(n+l,J) = R(n+1, J+2™ )R( j+2 7) '\ j) 
R(j+2 T, ,j) = R(j+2 T) , j+2 T) ' 1 )R(j+2 T>_1 , j) 


(16) 


(17) 

j+2 T, >j+2 T,_1 2:n+l 

j+2 1, an+l>j+2 1f) " 1 

n+l>j+2 7) >j+2 T)_1 


End_Do 

3) Rotate R(n+l,j) by processors of Row j to the processors of Row j-1. 

For j = 1, 2 n 

For i = 1, 2 j 

PR : R(n+1 , J+l ) (18) 
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with R(n+l,n+l) = U (Unit Matrix) 

Note that , as the result of the above data transfer, both the terms R(n+1, j) 
and R(n+1 , j + 1 ) reside in the memory of all the processors of the jth row. 

4) Parallel compute n+1 Z(j), n+1 P( j + 1 , j ) , n+l R(J), and nl K(j) by all the 

processors of the jth row. 


1. 

2 ,.. 

• . n 


i = 

= 1. 2 

j 


a) 

PR : 
Ji 

n+1 Z(j) = R(n+1, j) J Z(j) 

(19) 


with 

J Z(j) =[00 1 ] 1 


b) 

PR : 
Ji 

ntl P(j+l,j) = R(n+1, j+l) J+1 P(j+l, j) 

(20) 


with 

J+1 P( j + 1 , J) = [a t d i Sa i d^aj 1 


c) 

PR : 
ji 

n+1 H(j) = R(n+1, J+l) J+1 H(j) 

(21 ) 

d) 

PR : 
Ji 

n+1 K(j) = R(n+1 , j+1 ) J+1 K( j )R( J+1, n+1 ) 

(22) 


Note that for the processors of the nth row Eqs. (21)-(22) do not need any 

computation since the terms n+1 H(n) and n+1 K(n) are given constant parameters. 
As the result of the computation of Step 1, all the vectors and the tensors 
are projected onto the coordinate frame n+1. In the following, the absence of 
superscripts denotes that the computations are performed in this frame. 


Step 2: 

1) Parallel compute P(J,i) and u>(j,i) by processors of the ith column. 
For i = 1 » 2, ...» n 

For j = i> i+1, ...» n 

For T) = 1 step 1 until |"log 2 (n+l-j)] , Do 


j+2 11 . i) = i) = Z(l) 

PCj+Z^.j) = PU.j) 

P( j+2 11 . j ) = P(i.j) = P(i, j+2™) + PCJ+Z^.J) 
P(j+2 T, .j) = P^ 7 *, j+2™) + P(J+2 T, " 1 ,j) 


(23) 

j+2 1, 2:i>j+2 T, ~ 1 (24) 
i>j+2 T, >j+2 1)_1 


End Do 


2) Parallel compute V(j,i), F(j+l,j,i), N(j+l,j,i) by processors of the ith 
column. 

For i = 1 , 2, . . . , n 

For j = i, i + 1, . . .n 

PR jt : V( j, i ) = w(j, i)xP(j, i) (25) 
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(26) 


PR jt : F( j+1, j, i ) = w( J, i )xH( J) +M(J)V(J,i) 

PR^: N( j+1 , J, 1 ) = K(j)«(J, 1) + H( J)xV( J, 1 ) (27) 


Step 3: 

1) Parallel compute F(n+l,j,i) and N(n+l,j,l) by processors of the ith column. 

For 1 = 1, 2 n 

For J = 1, 1 + 1 n 

For 7) = 1 step 1 until flog^n+l-j)] , Do (28) 

F(j+2 71 , J. 1 ) = F(n+1, j,i) j+2 T, >j+2 T, " 1 in+l 

F(J+2 T> ,J,i) = F(n+1, J, 1 ) = F(j+2 7 \j+2™.i)+F(j+2 7, ~\ j,i) j+2 T) >n+l> j+2 T,_1 
F(j+2 T) .j.i) = F(j+2\ j+2 n ' l .i)+F(j+2 T, " 1 f j. 1 ) n+l> j+2 t) > j+2 11 " 1 


End Do 


For T) = 1 step 1 until flog 2 (n+l-j)] , Do 
N(j+2 T) , j, 1) = N(n+1, J, 1 ) 

N(J+2 T, ,J.l) = N(n+1, j, 1) = N(n+1, j+2 T,_1 , i)+N( j+2 7 *” 1 
Pii+Z 1 ''', J)xF(n+l, j+2 7 *' 1 , 1) 
N(J+2 T, .J.i) = N(j+2 7 *. J+2 T) ‘ 1 , i)+N(J+2 T, " 1 > J,i) + 
P(J+2 1h \ J)xF(J+2*. J+2 11 " 1 , 1) 


(29) 

j+2 T) >j+2 T)_l 2:n+l 

. j, D + 

j+2 7, in+l>j+2 7,_1 

n+l>J+2 17 >j+2 T,_1 


End Do 


2) Parallel compute a by PR 
F Ji Ji 

For i = 1, 2 n 

For J = i, i+1, . . . n 

PR^: a j4 = Z( j ) . N(n+1 , j, 1 ) (30) 

As stated before, the time lower bound in, as well as the computational 
cost of, parallel evaluation of the inertia matrix, using the VNEB algorithm, 
is determined by that of the parallel evaluation of the first column of the 
inertia matrix. In other words, the computational cost of the n recurrences in 
Eqs. (17), (23), (24), (28), and (29) is determined by that of the largest 
ones, i.e., for 1=1, which are of size n. Furthermore, the computational 

cost of all the 0(n 2 ) terms in Eqs. (16), ( 1 9 ) — ( 22 ) , (25)-(27), and (30) is 

determined by the cost of one term since for each column n terms are computed 
in parallel and the computation for n columns, as will be discussed later, is 
overlapped. Let m and a denote the cost of multiplication and addition, 
respectively. The computational cost of the parallel algorithm is then 
evaluated as follows: 
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Step 1: The cost of Eq. (16) is 4m; The cost of Eq. (17) is (27m+18a ) [log^] ; 

Eq. (18) represents a simple data rotation; Eq. (19) does not need any 
computation; The cost of Eqs. (20), (21), and (22) is (9m+6a), (9m+6a), and 

(54m+36a), respectively. The cost of this step is (27m+18a) [log^n] +(76m+48a) . 

Step 2: Eq. (23) does not need any computation; The cost of Eq. (24) is 
(3a) [log 2 n] ; The cost of Eqs. (25), (26), and (26) is (6m+3a), (9m+6a), and 

(15m+12a), respectively. The cost of this step is (3a) [log^n] + (30m+21a ) . 

Step 3: The cost of Eqs. (28) and (29) is (3a)[log 2 n] and (6m+9a) flog^] , 

respectively; The cost of Eq. (30) is (3/n+2a). The cost of this step is 
(6m+12a) [log 2 n] + (3m+2a). 

Adding the cost of Steps 1-3, the computation cost of the algorithm is 
obtained as (33in+33a) [log 2 n] + ( 109m+71a) . As stated before, mapping the 

developed parallel algorithm onto the processor array is fully presented in 
[12] where it is shown that, even using a simple nearest neighbor 
interconnection structure, a communication complexity of 0(n) can be achieved. 
Also, the mechanisms for global and local synchronization for the processor 
array are presented. Figure 4 shows the resulting distribution of the 
elements of the inertia matrix among the processors. 


IV. PARALLEL COMPUTATION OF N-E FORMULATION AND SOLUTION OF LINEAR SYSTEM 

As stated before, the bias vector can be computed by evaluating the N-E 
formulation while setting the vector of Joint accelerations to zero. We use 
the parallel algorithm presented in [15] for computing the bias vector. This 
computation is performed by the processors of the first column with the 
equations being projected onto the frame n+1. Therefore, the results of the 
computation of the first step except Eqs. (21)-(22) can be used while, similar 

to the terms n+1 K(j) in Eq. (22), the terms n+1 J(j), for j - 1, 2 n, are 

also needed to be computed by the processors of the first column. The cost of 
evaluation of the vector T is then obtained as 15a[”log 2 n] + (141m+101a). With 

the nearest neighbor interconnection among the processors of the first column 
a communication complexity of 0(n) is achieved. In order to achieve the proper 
sequencing of the computation of the inertia matrix, the bias vector, and the 
linear system solution, a data-driven mechanism can be employed. That is, the 
processors of all columns, except the processors of the first column, by 
completion of the computation of their corresponding column of the inertia 
matrix enter the wait state while the processors of the first column start the 
computation of the bias vector and the vector I\ The activity of the 
processors of column 2 through column n in linear system solution is triggered 
by receiving the corresponding data and by the completion of the computation 
of the vector T. 

Figure 4 shows the organization of the data resulting from the computation 
of the inertia matrix and the vector T. Given this data organization, we have 
developed synchronous data-flow parallel algorithms for the full solution of 
Eq. (3), that is, for decomposition of the inertia matrix, using the square- 
root-free variant of the Cholesky factorization, and the solution of the 
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resulting lower and upper triangular linear systems, which are presented in 
detail in [12]. The computation cost of the solution of Eq. (3) is obtained as 
(n-1 ) (3m+2a)+ ( lm+la) , where the cost of division and multiplication is taken 
to be the same. A communication complexity of 0(n) is also achieved. 

Adding the cost of evaluation of the inertia matrix, the bias vector and 
the vector T, and the linear system solution, the computational cost of the 
forward dynamics problem is obtained as (3m+2a)n+(33m+48a) flog 2 n]+(238m+171a). 

The computational cost of the best serial 0(n) algorithm, the Articulated-Body 
algorithm [ 3 ] — [ 4 ] , is evaluated as (380m+302a)n-(198m+173a) [15]. If the time 
of multiplication and addition is taken to be the same, then, for n = 6, the 
developed parallel algorithms achieve a speedup of 6 compared to the best 
serial 0(n) algorithm. Note, however, that as n increases, e.g., for redundant 
manipulators, the speedup also significantly increases. To see this, let us 
write the computational cost of the serial 0(n) algorithm and the parallel 
algorithms as 682n+0(l) and 5n+0( log 2 n)+0( 1 ) , with the time of multiplication 

and addition taken to be the same. It can be seen that, asymptotically, the 
parallel algorithms achieve a speedup of more than two orders-of-magnitude. 

For small n, the computational cost of the parallel algorithms is dominated by 
the [log^] -dependent and constant terms on the polynomial complexity. Hence, 

for small n, the computational complexity of the developed parallel algorithms 
can be practically considered as 0( log 2 n)+0( 1 ) . 

V. CONCLUSION 

We developed a set of parallel algorithms for computing the forward 
dynamics problem. These algorithms exploit a high degree of parallelism in 
the problem and achieve a significant speedup in the computation. Furthermore, 
they can be efficiently implemented on a two-dimensional array of processors 
with a nearest neighbor interconnection. This architecture is particularly 
suitable for practical implementation using VLSI and WSI technologies. Due to 
their simple architectural requirements, these algorithms, with some 
modifications, can also be efficiently implemented on rather more 
general-purpose architectures, e.g., a two-dimensional array of Transputers. 

A key factor in our approach to the parallel computation of the forward 
dynamics is the minimization of the resulting overhead. The overall 
communication complexity is of 0(n). The overhead is further minimized since 
there is no need for any data alignment between the computation of different 
algorithms and the intermediate data resulting from the different algorithms 
are generated and consumed within the array. Also, the final result of the 
computation, that is, the vector of joint accelerations, is computed by the 
processors of the first column. Therefore, they can be output using the same 
channels for inputting the data to the array (Fig. 5). This is particularly 
critical for VLSI and WSI implementation since, by using only n bidirectional 
Input/Output channels, the number of required pins is kept small. 
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a^, d^ , & Length, Distance, and Twist Angle of link i, respectively. 

Q Q & Q i Position, Velocity, and Acceleration of joint i, respectively. 
M(i) Mass of link i. 

J ( i ) Second moment of mass of link i about its center of mass ( C^ ) . 

H(i) First moment of mass of link i about point 0 . 

K(i) Second moment of mass of link i about point 0^. 

Z(i) Axis of joint i 

P(i, j) Position vector from point 0^ to point 0^. 

R(i,j) A 3x3 matrix representing the orientation of coordinate frame j 

with respect to coordinate frame i. 
u(i, j) Angular acceleration of link i resulting from the unit 

acceleration of joint j. 

V(i,j) Linear acceleration of link i (point CM resulting from the unit 

acceleration of joint j. 

F(k + 1 , i , j ) Force exerted on point 0^ due to the acceleration of links i 

through k, i.e., the links contained between points 0 and 0 , 

resulting from the unit acceleration of joint j. 

N(k+l,i,j) Moment exerted on point 0^ due to the acceleration of link i 

through k, resulting from the unit acceleration of joint j. 

Table I. Notation used in derivation of serial and parallel algorithms 
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Figure 3. A Two-Dimensional Processor Array (a) Data Input to 
Processor Array, (b) Distribution of Input Data Among Processors. 



Figure 4. Organization of Data 

Resulting from Computation of the Figure 5. Data Output From 
Inertia Matrix and the Bias Vector. Processor Array. 
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Abstract 


This paper proposes a Non-Strict computational approach for real-time robotics control com- 
putations. In contrast to the traditional approach to scheduling such computations, based 
strictly on task dependence relations, the proposed approach relaxes precedence constraints 
and scheduling is guided instead by the relative sensitivity of the outputs with respect to the 
various paths in the task graph. An example of the computation of the Inverse Dynamics of 
a simple inverted pendulum is used to demonstrate the reduction in effective computational 
latency through use of the Non-Strict approach. A speedup of 5 has been obtained when 
the processes of the task graph are scheduled to reduce the latency along the crucial path of 
the computation. While error is introduced by the relaxation of precedence constraints, the 
Non-Strict approach has a smaller error than the conventional Strict approach for a wide 
range of input conditions. 


I. Introduction 

Complex robot dynamics computations have typically been represented using directed 
task precedence graphs, in order to facilitate the exploitation of parallelism in their exe- 
cution [1,2,3]. The nodes of such a task graph represent computational modules, with the 
directed edges imposing a strict partial order on their execution sequence. While such a 
computational approach is faithfully accurate to the underlying physical model of the robot 
system if executed instantaneously, in practice the computational latency can be significant 
even with state-of-the-art computers. The use of pipelining also does not reduce this latency, 
even though it increases computational throughput. 

The Non-Strict computational approach is motivated by a need to reduce the “effective 
latency” from input to output for complex robotics computations. The terms Strict and 
Non-Strict are derived from the manner in which precedence is treated in scheduling the 
tasks of a task graph such as the one shown in Fig. 1. The approach proposed here has the 
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same motivation as the “Imprecise Results” approach recently proposed in [4], but uses a 
different model. The key concept behind our Non-Strict approach is that of relaxing the strict 
precedence constraints imposed by a conventional task-graph model, to facilitate trading off 
accuracy for timeliness in real-time computation over sampled, continuously varying inputs. 
Rather than sample all inputs simultaneously and then schedule tasks in strict adherence 
to the precedence constraints dictated by the edges of the task graph, tasks are scheduled 
so as to minimize delay between input and output along paths in the task graph that most 
strongly affect the output, perhaps using some previously computed values along less crucial 
paths. 

Consider the task graph shown in Fig. 1. The circles represent computational tasks 
to be performed and the arcs represent data flow and thus precedence. According to the 
conventional Strict model of computation, the input, q, should first be sampled. Then it 
should proceed through the sequence of 1-2-3-4-5, completing the computation by furnishing 
the output, r. The latency between input and output is just the total computation time; it 
represents a lag in the real-time control implementation and tends to destabilize the system. 

If the output of process 5 is more strongly affected by the output of process 4 and further, 
if process 4 is strongly affected by the output of process 1 and only weakly dependent on 
process 3, then a better schedule, which does not strictly adhere to the natural precedence 
relationships, may be developed. In particular, it may be best to sample the input, compute 
process 1, resample the input, compute process 4 using the last computed value of process 3, 
and compute process 5 to generate the output. After process 5 is complete, then processes 
2 and 3 may be scheduled which gives the following ordering on the computations 1-4-5-2-3 
which does not follow strict precedence. However, if the 1-4-5 path from input to output is 
the most crucial to the computation, then the effective latency has been reduced. This is 
especially true if the computation times of processes 2 and 3 are long as compared to the 
others. 

From the above example, several important characteristics of the Non-Strict computa- 
tional approach may be noted: 

1. It is appropriate to the case of real-time control in which the same computation is to 
be repeated over successive time samples of the inputs. 

2. The main objective is to relax the precedence so as to reduce the effective latency from 
input to output. Rather than sampling the inputs once for a computation so that the 
results are strictly correct, albeit delayed, before each process which requires an input, 
a fresh sample of the input is taken so that the effects of delay are minimized. Also, 
computation generally proceeds along the crucial paths using the freshest (but past) 
values generated by the other less crucial processes. While this is often not according 
to precedence, again it can have the effect of reducing the latency. 

3. While the Non-Strict approach does introduce error into the computation by violating 
task-graph precedence constraints, and can therefore only be an approximation of the 
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Figure 1: Example Task Graph to Illustrate Non-Strict Computation. 

ideal output, the maximum instantaneous error and average error are often much less 
when compared to that resulting from the use of a Strict approach (which produces 
an output with the same shape as the ideal, but delayed in time). 

4. By relaxing precedence constraints, potential parallelism may be increased. That is, 
the task precedence relationships constrain the flow of computation to certain se- 
quences/orderings. With these relaxed, a greater amount of parallel scheduling of 
processes may proceed. 


In order to further develop the basic concepts, an example of the dynamics of a simple 
inverted pendulum will be given. A description of the system, including its dynamic equation 
of motion and task graph for its evaluation, will be given in the next section. Following that, 
results using the Non-Strict computational approach will be given and will be compared with 
the Strict approach. Finally, the paper ends with a summary of the results, and conclusions 
on which to base further investigations. 
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II. Simple Inverted Pendulum Example 


The computation of the dynamics of the simple inverted pendulum system shown in 
Fig. 2 is to be considered. The system consists of a single link which is connected through a 
one- degree-of- freedom revolute joint to a fixed base. Gravity acts in the vertical direction, 
and an actuator is mounted at the joint to power the pendulum movement. While the model 
is quite simple, it has been used in the past to study the dynamics of biped walking in the 
single support phase of locomotion [5]. 

The basic computational task is to solve the Inverse Dynamics problem [6] in real time. 
The dynamic equation of motion for the pendulum may be written as follows: 


r = Iq + Bq — mgl sin (q) 


where 


(i) 


9 , 9 , q = 

I = 
B = 
m = 
/ = 


Actuator torque, 

Joint position (as referenced to the vertical), 
velocity, and acceleration 

Moment of inertia of the link about the joint axis, 
Joint actuator damping coefficient, 

Mass of the link, and 

Position of the center of gravity of the link 
from the joint axis. 


Note that values for the system parameters are also given in Fig. 2. 

For Inverse Dynamics, the joint position, velocity, and acceleration are given and the 
joint torque is to be computed. Inverse Dynamics is used in advanced control schemes for 
robotic systems to determine the torque required for a desired motion trajectory and must 
be computed at real-time rates of up to a few hundred hertz [7]. 

A task graph for the Inverse Dynamics computation is shown in Fig. 3. There are 6 
processes with the top number in the circle giving the process number. The operation 
performed is given in the middle section of the circle while the computation time is given in 
the bottom part and is normalized to units of a basic computation time, A. Note that it is 
assumed that adds and multiplies take a single unit of time and that the sine computation 
takes ten times as long. For the purposes of this example, it is assumed that the input 
position, q , is a simple sinusoid: 


q(t) = sin(a;f) . 


( 2 ) 
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I = 15 ft 

I = 1.86 ft-lb-s 2 

B = 30 ft-lb-s 


Figure 2: Simple Inverted Pendulum System. 


i<t) i<t) i<t) 



Figure 3: Task Graph for Inverse Dynamics Computation. 
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where u is the frequency. 


III. Computational Results 

In this section, using the Inverse Dynamics of an inverted pendulum as an example, 
a number of results will be given which illustrate the Non-Strict computational approach 
and compare it with the Strict approach. The output of the computation under the Strict 
and Non-Strict approach are contrasted for a simple sinusoidal input to the system. If the 
computational delays in evaluating the component operations in the task graph were zero, 
then both the Strict and Non-Strict approaches would yield identical results. With the Strict 
approach, the generated output is identical in shape to the ideal output, but shifted in time 
by the sum of the computational delays of the tasks in the task graph. With the Non-Strict 
approach, the output (in general) approximates the shape of the ideal output but is not 
identical in form. However, the overall error (including the effects of time delay) can often 
be significantly less than the error with the Strict computation. 

In this section, the two approaches are compared using two measures - 1) mean square 
error , and 2) effective latency. The mean square error over a period is defined as: 

mean square error = — / — x(f)] 2 dt (3) 

1 Jo 

where T is the period of the input and r^t) is the ideal output. The effective latency A t efS 
is defined in the following way: 


At e // = At : Min ^ jf [r;(< - At) - r(t)] 2 dt. (4) 

Thus a best fit is attempted between a delayed version of the ideal output and the Non- 
Strict output, and the shift in the delayed version of the ideal output is interpreted as an 
effective latency of the computation. As long as the mean square error between the output 
and the shifted ideal output is sufficiently small, the error in output may be related to that 
arising from a time delay of the ideal output. Thus, the outputs of the Strict and Non-Strict 
approaches may be compared in terms of effective computational latencies, where the scheme 
with the lower effective latency is preferable. 

In the following, speedup through reduction in the effective latency of the computation 
will be shown. The relationship of the error in both the Strict and Non-Strict cases will 
be given as a function of the input frequency. The objective is to determine a range of 
frequencies for which the Non-Strict approach will give speedup while maintaining reasonable 
limits on the error. Finally, the relationship between the effective latency and the input 
frequency will be investigated and will provide motivation for adaptive scheduling strategies. 
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Figure 4: Plot of the Torque for the Ideal, Strict, and Non-Strict Cases Over a Period 
(uj = 12.5 rad/s and A = 7r/1500 s). 

A. Reduction in the Effective Latency 

Fig. 4 compares the outputs of Strict and Non-Strict evaluations of the task graph of Fig. 3 
against the ideal output obtained with zero computational delay. Results for = 12.5 rad/s 
and A = 7r/1500 s are given. If the tasks are scheduled in the order 1-2-3-4-5-6 instead of 
an order dictated by strict adherence to task precedence constraints, then speedup should 
result by a reduction of the effective latency of the computation. In particular, at high 
frequencies when the inertial term dominates the torque computation, it is anticipated that 
the computational delay may be as little as 3A since this is the most crucial path in that 
case. Of course, the schedule is not according to strict precedence and some amount of error 
will be introduced. 

Note that the Strict curve is an exact form of the ideal but delayed in time by 15 A. While 
the Non-Strict curve is not an exact form of the ideal, it gives the least amount of error for 
almost the entire period. For the present results, little difference can be seen between the 
shape of the ideal and Non-Strict curves. In fact, the Non-Strict curve does not quite reach 
the desired peak and the slopes are slightly different along the trajectory. However, the 
effective latency has been reduced considerably to as little as 20-30% of the Strict case. 
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Figure 5: Mean Square Error for the Strict and Non-Strict Cases as a Function of the 
Frequency (A = 7r/1500 s). 

B. Error Vs. Frequency 

If the frequency is increased such that the period is exactly 15A, then the Strict approach 
will give perfect results. The question becomes: is there a range of frequencies over which 
the Non-Strict approach will give the best results (least error)? Toward this end, the error 
of the Strict and Non-Strict approaches with respect to the ideal have been evaluated as a 
function of the frequency. Typical results are given in Fig. 5. 

The results indicate that the Non-Strict approach has a smaller error up to a frequency 
of approximately u = 168 rad/s when A = tt/1500. As might be expected, this crossover 
frequency varies with the value of A chosen. As the value of A increases, the crossover value 
of frequency decreases. In fact, as long as the value for A is less than a certain fraction of 
the period, then the Non-Strict approach will be better. For the example system, this ratio 
has been determined as: 


— < 0.056. 
T “ 


( 5 ) 


For this example, if the Strict computation delay (15A) is less than approximately 80% of 
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Figure 6: Effective Latency as a Function of Frequency for the Schedule (1-2-3-4-5-6). 
the period, then the Non-Strict approach will give better results. 


C. Effective Latency Vs. Frequency 

The schedule assumed for the previous results tends to work well for high input frequen- 
cies, since in this region the crucial path is 1-2-3 and is scheduled first. Then for very high 
frequencies, it is anticipated that the overall effective latency will be approximately 3A which 
is the computational delay along this particular path. At low frequencies, the gravitational 
term will dominate the inertial term. With the same schedule, then, it is anticipated that 
the effective latency will be longer. In fact, the delay from the input q to the output r is 
seen to be 15A. Therefore, the effective latency of the computation should be in the range 
of 3 — 15A and will vary with the frequency. 

Results have been obtained for the effective latency as a function of the frequency and are 
shown in Fig. 6. As expected, the effective delay is 15A at low frequencies and decreases to 
3A at high frequencies. The crossover takes place in the region between u> = 0.1 to u> = 10.0 
during which the significance of the inertial, damping, and gravitational terms changes. 

If operation of the inverted pendulum is at lower frequencies, then an alternate schedule 
may be more appropriate. In particular, the schedule 4-5-3-1-6-2 should have an effective 
latency of 12A at low frequencies - better them the previous schedule. Results for this case 
are given in Fig. 7. The results Eire as expected. Note that the delay at high frequencies is 
15A because of the long delay from sampling q to the output. Also note that the damping 
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Figure 7: Effective Latency as a Function of Frequency for the Schedule (4-5-3-1-6-2). 

term with q appears to dominate at approximate w = 4 so that the effective delay is 14A. 

In fact, in general, the schedule should vary depending upon the rate of change of the 
inputs, and their effect on the output. For low frequencies, the path 4-5-3 should be favored 
while for high frequencies, the path 1-2-3 should be favored. This gives rise to a need for an 
adaptive scheduling scheme and will be considered in future investigations. 


IV. Summary and Conclusions 

This paper has proposed a Non- Strict computational approach attractive for real-time 
applications such as robotics control. In this approach, the precedence set in a task graph 
model of the computation is relaxed so as to reduce the effective latency from input to 
output. This is achieved by appropriately ordering the execution of the processes so that 
the most crucial path from input to output is given priority. While some amount of error 
is introduced since the precedence relationships are not strictly adhered to, there are many 
cases in which the overall error is less than that introduced by the sheer delay of the Strict 
case. 

The Non-Strict approach has been applied to compute Inverse Dynamics [6] for a simple 
inverted pendulum. At high frequencies of motion, the effective latency was reduced from 
15A to 3A, giving a speedup of 5. For a simple sinusoidal input on the joint angle, the 
error for the Non-Strict case was better than the Strict case as long as the Strict compu- 
tational delay was less than approximately 80% of the period. This is the case in practical 
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applications. The effective latency for the Non-Strict case was shown to be a function of the 
frequency of the sinusoidal input. This results from a change in the crucial path through the 
task graph as the frequency varies. Two different schedules showed the range of possibilities 
for reducing the effective latency and motivated the need for an adaptive scheduling scheme. 

There axe a number of avenues for further work which are presently being explored. First 
of all, speedup has been obtained here for the case of a serial machine. The use of a Non- 
Strict computational approach can clearly be expected to have an impact on exploitation of 
parallelism. The strict precedence constraints imposed by a conventional task-graph model 
tend to limit parallelism. The relaxation of precedence constraints with the Non-Strict 
approach should thus be conducive to the better exploitation of parallelism. However, in the 
parallel context, the determination of an optimal scheduling order becomes more difficult, 
with the two inter-related criteria of 1) task ordering for minimization of Effective latency” 
through use of a measure of “task crucialness”, and 2) minimization of computational latency 
through maximization of processor utilization. 

A second issue is that of adaptive scheduling. Based on the rate of change of a process 
output and its significance to the computation, a constantly adapting schedule should give 
better results than one which is fixed. Also, processes need not be scheduled at the same 
rate and such multi-rate schemes may be especially important in effectively utilizing the 
resources of a parallel processor system. Further work in scheduling strategies is needed. 

Another area in which the work may be extended involves the use of prediction to reduce 
the effective latency and computational error. In particular, if prediction is made on the 
inputs before use by a process, then it may be possible to reduce the effective process 
latency and therefore the overall latency. Investigations into appropriate schemes for using 
the prediction are needed. 

Finally, the applicability of the Non-Strict computational approach to a wider class of 
real-time control problems should be investigated. The preliminary results presented in this 
paper show much promise for speedup and should be applied to other difficult real-time 
computational problems. 

The Non-Strict computational approach seems promising as a first step towards a frame- 
work for specifying real-time robot dynamics computations in a machine-independent man- 
ner. It is current practice to choose the model/algorithm with a view to what is imple- 
mentable to provide real-time response when executed on a given target machine. The 
flexible and adaptive scheduling of tasks inherent with the Non-Strict approach suggests 
that a single algorithmic representation can be adequate for slow and fast processors alike, 
for single as well as multi- processors. This seems quite feasible especially if computation- 
ally inexpensive predictors/extrapolators are used as alternatives in conjunction with more 
accurate but computationally demanding task blocks. A powerful target machine might 
schedule the computationally demanding blocks every time needed and thus obtain great 
accuracy, whereas a weaker target machine might schedule those computationally intensive 
blocks relatively infrequently and use the predictors in between, thereby trading off accuracy 
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for timeliness. 


Work is currently in progress in evaluating the Non-Strict approach with more complex 
robot dynamics computations, and in investigating the use of predictors and multi-rate task 
scheduling strategies. A testbed is also under development on a BBN Butterfly multiproces- 
sor to investigate Non-Strict scheduling in a parallel setting. It is hoped that with further 
work the Non-Strict approach can be established as an effective approach for complex real- 
time domains such as robotics control. 
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Abstract 

A fundamental issue which directly impacts the scalability of current 
theoretical neural network models to massively parallel embodiments, in 
both software as well as hardware, is the inherent and unavoidable 
concurrent asynchronicity of emerging fine-grained computational 
ensembles and the possible emergence of chaotic manifestations. Previous 
analyses attributed dynamical instability to the topology of the 
interconnection matrix, to parasitic components or to propagation delays. 
However, we have observed the existence of "emergent" computational 
chaos in a "concurrently asynchronous" framework, independent of the 
network topology. In this paper we present a methodology enabling the 
effective asynchronous operation of large-scale neural networks. Necessary 
and sufficient conditions guaranteeing concurrent asynchronous 
convergence are established in terms of contracting operators. Lyapunov 
exponents are computed formally to characterize the underlying nonlinear 
dynamics. Simulation results are presented to illustrate network 
convergence to the correct results, even in the presence of "large" delays. 
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Abstract 

This paper presents a multi-flexible-body dynamics formulation 
incorporating a recently developed theory for capturing motion induced 
stiffness for an arbitrary structure undergoing large rotation and 
translation accompanied by small vibrations. In essence, the method 
consists of correcting prematurely linearized dynamical equations for an 
arbitrary flexible body with generalized active forces due to geometric 
stiffness corresponding to a system of twelve inertia forces and nine inertia 
couples distributed over the body. Equations of motion are derived by 
means of Kane s method. A useful feature of the formulation is its 
treatment of prescribed motions and interaction forces. Results of 
simulations of motions of three flexible spacecraft, involving stiffening 
during spinup motion, dynamic buckling, and a repositioning maneuver, 
demonstrate the validity and generality of the theory. 
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ABSTRACT 

This paper considers dynamics of chains of flexible bodies undergoing large rigid body 
motions, but small elastic deflections. The role of nonlinear strain-displacement relations in the 
development of the motion equations correct to first order in elastic deflections is investigated. The 
general form of these equations linearized only in the small elastic deflections is presented, and the 
relative significance of various nonlinear terms is studied both analytically and through the use of 
numerical simulations. Numerical simulations are performed for a two link chain constrained to 
move in the plane, subject to hinge torques. Each link is modeled as a thin beam. Slew maneuver 
simulation results are compared for models with and without properly modeled kinematics of 
deformation. The goal of this case study is to quantify the importance of the terms in the equations 
of motion which arise from the inclusion of nonlinear strain-displacement relations. It is concluded 
that unless the consistently linearized equations in elastic deflections and speeds are available and 
necessary, the inconsistently (prematurely) linearized equations should be replaced in all cases by 
"ruthlessly" linearized equations: equations in which all nonlinear terms involving the elastic 
deflections and speeds are ignored. 

1 INTRODUCTION 

In recent years a fundamental limitation of the finite element formulation of flexible 
deformations in flexible multibody simulation programs such as TREETOPS[l], DISCOS [2], etc. 
has been pointed out [3,4,5]. This limitation could be characterized as a premature linearization of 
velocity expressions that is implicit in a linear finite element or modal formulation of the motion 
equations for flexible bodies [4]. Kane et al. [6] demonstrated this flaw of such an approach 
numerically by simulating a simple system consisting of a flexible beam attached to a rigid base 
spinning in the plane. This simulation yielded the surprising and intuitively wrong result of the beam 
diverging during a spin up maneuver [3]. Probably because of this simple example the "prematurely 
linearized" equations of motion are said to lack the "spin-stiffening effect." Further study of this 
simple system showed that this limitation of the traditional approach does not significantly affect 
simulation results for some maneuvers typical of space dynamical systems, such as repositioning 
slewing maneuvers and station keeping [4]. 

This paper presents a general discussion of the equations of motion of a flexible multibody 
system undergoing motion with large rigid body rates (not just angular velocities) and rigid body 
configuration changes, but with infinitesimal elastic deformations. The generic equations of motion 
for such systems are presented; linearized only in elastic deformation amplitudes. The implicit 
assumption here is that this is a useful set of motion equations: nonlinear in rigid body rates and 
coordinates but linear in flexible coordinates and rates. Some of the terms in these motion equations 
cannot be obtained with linear kinematics of elastic deformation (i.e., the traditional linear finite 
element or modal formulation). This paper illuminates the form of these practically unobtainable (in 
the general case) terms, evaluates their relative importance and examines the possible consistent 
simplifications of the motion equations. 
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The practical impact of these simplifications is investigated with the use of two case studies. 
The consistent equations of motion, derived with nonlinear strain-displacement relations, are 
explicitly given for two systems: 1) a single Bemoulli-Euler beam cantilevered to a rigid base 
undergoing large rigid body, but small flexible motions in the plane; and, 2) a two-link, revolute, 
planar, flexible manipulator consisting of three rigid bodies (shoulder, elbow and end effector) 
connected by two Bemoulli-Euler beams. These two examples will serve to emphasize the general 
discussion and to quantify the magnitudes of the neglected and retained terms in the equations of 
motion. This will be done both analytically and through comparison of simulation results. 

Before proceeding to a more general discussion of the "correct" linearized equations of 
motion, a brief explanation of how these equations are obtained through proper linearization, and of 
the role of nonlinear strain-displacement relations is in order. 


2 NONLINEAR STRAIN-DISPLACEMENT AND PROPER LINEARIZATION 

The problem that concerns us is that of obtaining the correctly linearized equations of motion 
for an important class of systems which exhibit large rigid body motions but small elastic deflections. 
By far the most common practice to date is to handle the flexibility through discretization of the 
desired continuous system. This is achieved by representing the solution as a finite series of time- 
dependent generalized elastic coordinates multiplied by space-dependent functions, as in an assumed 
modes approach, or in a finite element formulation [7]. Whichever formulation one uses, in light of 
the class of systems under study, the next step is to assume that these elastic coordinates, together 
with the generalized elastic speeds, are infinitesimally small. In other words, we assume these 
coordinates and speeds to be small enough so that only terms linear in them are kept in the equations 
of motion, as terms of second order or higher are negligible. 

Now that our goal is clearly stated, it should be an easy matter to obtain the linearized 
equations of motion as long as we consistently drop all terms nonlinear in the elastic coordinates and 
the corresponding generalized elastic speeds. Of course, the word "consistently" is the catch. When 
do we linearize? That is to say: Does it matter at what step in our derivation of the equations of 
motion we start to linearize? To answer this question we have to consider the process by which we 
derive these equations. 

Let us consider two of the more widely known methods to derive motion equations for 
complex systems: Lagrange's equations of motion [8] and Kane’s dynamical equations [9]. 
Lagrange's equations for a holonomic system with n generalized coordinates <?*: 


d ( dL\ dL n 


( k 


( 1 ) 


L = T -V 

where the Lagrangian L is a function of the system kinetic and potential energies (T and V 
respectively). Qk are n generalized non-potential forces. The important thing to note is that using 
this method to derive motion equations requires differentiating both the potential and kinetic energies 
of the system with respect to the generalized coordinates and speeds. If these qi and u,- (=dqjdt) 
were to represent our generalized elastic coordinates, we see that the above differentiations imply that 
some terms linear in qi and u t in the energy expressions become terms of zeroth order in qi and u,. 
More importantly, we see that terms of second order in the generalized coordinates and speeds in the 
energy expressions become terms of first order in the resulting equations of motion. Clearly then in 
order to obtain equations of motion correct to first order in q t and m, we need to have energy 
expressions for our system that are correct to second order in these same elastic generalized 
coordinates and speeds. More specifically the requirement demands in general that the expressions 
for displacements and velocities used in determining potential and kinetic energies be correct to 
second order in the elastic coordinates and speeds. Only by doing this can we ensure consistent 
linearization. 
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Kane's dynamical equations for a holonomic system of n particles with n generalized speeds 


us 


F r + F'= 0 

C r = l,...,/t) 

K= i y; • R* . 

i = l 

R . = - m. a. 

i i i 

P . " P. 

v * = It/h. + v , 
i = l 1 

II 


( 2 ) 


where F r is the generalized active force, F* is the generalized inertia force, R,* is the inertia force for 
particle P,- in an inertial reference frame, and v p i is the velocity of this particle in the same frame. v r p i 
is the r-th partial velocity of particle P, in the inertial frame. Using a similar argument as above, it is 
easy to see that since the partial velocity has to be correct to first order in the generalized coordinates 
and speeds, and again this term is obtained through differentiation of the velocity with respect to the 
generalized speeds, the velocity has to be correct to second order in qi and u, until we form the partial 
velocities. This is necessary if we want our equations of motion to be consistently linear in the 
generalized coordinates and speeds. 


3 FORM OF THE EQUATIONS OF MOTION FOR A CHAIN OF ELASTIC 
BODIES 

The equations of motion of an open chain of elastic bodies can be expressed quite generally 

as [10]: 


^ rr (X yC[ ) 

Mre (x y q) - - 

1 T C,R 


rr i 

ext,R 

^ ER ^ ^ ) 

M^iXyq) JU 

J = L r c. £ . 

+ 

T 

1 ext, EJ 


F r (x ,qjc,u) 

L F £ (x,q,x, u) f 


u = q 


( 1 ) 


where x is a vector of rigid body generalized coordinates; q is a vector of the elastic generalized 
coordinates; Mrr, Mre, Mrr , Mee form the configuration-dependent mass matrix; Tq is a vector of 
control forces (as in joint-torque actuators in a manipulator); is a vector of other generalized 
external forces; Kre is a constant stiffness matrix (see equation (2) below) and F is a vector of 
nonlinear inertial (coriolis and centripetal) forces. 

We are often interested in the important class of systems for which the elastic deformations 
remain small so we can ignore terms of second order in q and u. Strictly speaking this requires that 
ll< 7 ll and Hull be infinitesimally small. However we know that if, for example, our flexible body is 
modelled as a beam, it is sufficient that the elastic deformations do not exceed one tenth of the length 
of the beam in order to use linear Bemoulli-Euler beam theory. At any rate, given this assumption of 
small elastic deflections, we could expand the previous equation in order to show more explicitly the 
form of the nonlinear terms: 


M RR (x,q ) M R£ (x,q) T x 1_ t c,r + ^exi.R 
M^ix ,q) (x , q) V k J t c ,e _ + J e xt,E 
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* x , 
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(M; r (x,q)) =n? (x)q, (M' R (x,q)) = n? Tlj {x)q 

ij V (2) 

where n above is the number of rigid body coordinates; fuj is a column matrix, but/^y and .ft; are 
n+m by m matrices, where m is the number of elastic coordinates. 

In light of the discussion in the previous section, we could further write: 

v*>= 4-CO+4-U) 

m Uj (x ) = m\.j(x ) + ) 

m 2 ij (x)= m\ u (x)+ m 2 2 ij (x) (3) 

where the superscript 2 terms can only be obtained through the use of displacement and velocity 
expressions, in the development of the equations of motion, that are accurate to second order in the 
elastic generalized coordinates and speeds ( q and «). This requires the use of nonlinear strain- 
displacement relations [11,12], nonlinear kinematic constraints [6,4], or the use of a nonlinear 
"geometric stiffening" term appended to the incorrectly linearized equations of motion [5,13]. 

The complexity of equations (2) and the difficulty involved in obtaining the nonlinear terms 
have prompted attempts at simplification. It is common [10] for example to assume small velocities 
and drop all terms nonlinear in rates. This results in rate-linear motion equations that greatly simplify 
the dynamicist’s task. It has been pointed out [14], however, that in the case of n-link rigid 
manipulators in any configuration, the velocity and acceleration terms of the dynamic equations have 
the same relative significance at any speed of movement. The fact that the omission of these terms 
does not significantly affect simulation results is attributed to the fact that gravity and joint friction 
usually overpower inertial terms. These results have not been extended to chains of flexible bodies. 
One might argue that in some limit (i.e., vanishingly small q) the equations of motion of the flexible 
multibody system should reduce to those of the rigid multibody system. Then it seems that a good 
case could be made for the inclusion of at least nonlinear terms in the rigid body rates in our rate- 
linear equations (i.e.,/i,y(jc)), particularly considering the fact that future, fast, space manipulators 
with low joint frictions are part of the class of systems under consideration. 

Faced with this, we can proceed in two ways with respect to the equations of motion: we can 
be consistent, or we can be selective. To be consistent requires keeping all terms of order lUyll and Hull 
in equation (2), i.e., no simplification. We also encounter the problem that the superscript 2 terms in 
equations (3) are not readily available in the general case since they depend on nonlinear elastic 
theory or nonlinear kinematics of deformation for their derivation. We could just make do with the 
superscript 1 terms in equations (3) (standard approach) but this would not be consistent nor 
justifiable since there is no a priori reason to guarantee lift# 1 II » l|fty 2 ll, for example. 

We are forced then to be selective, at least in the general case. Now we have to rely mostly 
on experience and simulation to determine which terms are important and which negligible under 
given conditions. Using the simple example of a beam radially cantilevered to a spinning hub, an 
empirical speed limit has been proposed beyond which the standard linear finite element or modal 
formulations of the model give erroneous results [5]. This limit is specified as follows: the 
magnitude of the spinning rate of the system has to be one order of magnitude less than the 
fundamental bending frequency of the beam. We use this simulation result to claim the following: 
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unless the equations of motion exact to first order in elastic generalized coordinates and speeds are 
available, we are limited to rigid body angular rates less than an order of magnitude lower than the 
lowest fundamental bending frequency of our system. In view of this, we might be able to model 
our system accurately enough by just keeping the rate-linear equations together with terms f U j. In 
other words, we might as well drop all nonlinear terms involving elastic coordinates and speeds. We 
further claim that speed or acceleration limits also exist in translational rates or accelerations, arising 
from those mass matrix terms that depend on q and cannot be obtained through the standard 
approaches. In the next section we investigate these claims analytically and through simulation. 

In what follows we shall refer to three types of models for a flexible multibody system under 
study. The "consistent" model will be that which retains all terms to first order in q and u, that is, the 
consistently linearized model. The "inconsistent" model shall be that obtained through linear 
kinematics of deformation, that is, one whose equations omit the superscript 2 terms mentioned 
above. Finally, a "ruthlessly linearized," or simply "ruthless" model, shall be one in which the 
nonlinear terms which include elastic coordinates and speeds are ignored, including those terms in 
the mass matrix which depend on elastic coordinates. In other words, in our "ruthless" model 
equations we ignore terms fay and fn, and we assume the mass matrix depends on rigid body 
configuration only. 



Fig. 1: Single Beam Attached 
to a Moving Base 

4 Two Case Studies 



Fig. 2: Fundamental Bending 
Frequency of Spinning Beam vs Spin Rate 


4.1 One flexible body example 

Consider a simple, slender, uniform beam cantilevered to a rigid body free to move in the 
plane (see Fig. 1). The frame N, defined by the unit vectors ni, 112, 113, is inertial, and we 
introduce the rotating frame A defined by the unit vectors ai, a2, a3, attached to body A and whose 
ai axis lies along the undeformed neutral axis of the beam B initially. The consistently linearized 
equations of motion for this system have been derived using Kane's dynamical equations together 
with nonlinear strain-displacement relations. Shear and rotary inertia effects have been ignored (i.e., 
slender beam assumption). The equations of motion, exact to first-order in generalized elastic 
coordinates and speeds are [ 15 ]: 
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where following Kane et al. [16] we have defined: 
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In deriving the above equations, we have assumed no external forces act on the beam for simplicity. 
mji is the mass of the rigid body A; vi and V 2 are translational speeds of body A in the directions of 
ai and a 2 respectively; V 3 is the rotational speed of body A; < 7 , are the n generalized elastic 
coordinates, where we have discretized the transverse elastic deformation of the beam using assumed 
modes. 

Specializing equation (1) to the prescribed motion of uniform rotation of the base, 

vi = v 2 = 0, V3=f2=constant 


we get: 


'LG.q. 
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Note that the terms Pij and rjy cannot be obtained using linear kinematics of deformation but are 
obtained through the use of nonlinear strain-displacement relations. The term 

tffrp.. + Tj.j) 

is known as the geometric stiffness matrix for this specialized rigid body motion (rotation). It is easy 
to observe from just this analytical study that in the absence of these terms our equations lose 

stiffness with increasing angular rates Q, since: 
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We note that for a variety of mode shapes [17]: 
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and it is clear that as long as £2 « (Oi the incorrect de-stiffening effect will not be apparent. This 
clearly evidences the angular speed limit mentioned at the end of the previous section. As shall be 
seen, this rotational rigid body rate limit is the most restrictive on die validity of other than the 
correctly linearized equations for the maneuvers considered. Figure 2 shows the first bending 

eigenfrequency as a function of £ 2 , as predicted by the three modelling approaches, each using the 
same assumed modes. 

If we now specialize equation ( 1 ) to the prescribed motion of constant translational 
acceleration: 

du\/dt = g = constant, V2 = V3 = 0 

we obtain: 

^ G ifli + X (J^ij ij ) < 7 , = 0 

i -1 1=1 

^ = 1 (5) 

Hij is in this case the operative geometric stiffness matrix. Now we can see that for g large enough, 
we again obtain de-stiffening. Another way of looking at it is to realize that for g large enough the 
stiffness matrix becomes non-positive definite which implies that the beam buckles due to its own 
weight. The predicted buckling is correct, and would have been lost with the inconsistently or the 
ruthlessly linearized approaches. This suggests that a translational rate or acceleration limit also 
exists beyond which our model is again grossly incorrect if we do not use equations exact to first 
order. Inspection of equation ( 1 ) suggests that another rate limit exists, this one on the product of 
rigid body rotational and translational rates, V2V3. Baneijee’s recent work [ 13 ] suggests that as many 
as 12 such independent limits on rigid body motion exist and must be considered for general three- 
dimensional motion. It is doubtful that for typical stop-to stop slew and repositioning maneuvers 
these limits become operative, since the magnitude of the applied forces is limited by the requirement 
that the flexible body not deform excessively. This anticipation motivates the case study of the next 
section. 

4.2 Two-link flexible arm example 

Consider a two-link flexible, revolute arm composed of three rigid bodies connected by two 
slender uniform beams (see Fig. 3 ). The equations of motion for this system were derived using 
Kane’s dynamical equations together with nonlinear strain-displacement relations. The equations are 
thus exact to first order in the beams’ elastic generalized coordinates and speeds. We ignore 
independent axial extensions of the beams (i.e., the axial strain at the neutral axis is assumed zero for 
each beam). The elbow joint is actually modelled as two bodies: one attached to link 1 in a 
cantilevered way and the other, free to rotate with respect to the first, with link 2 attached to it in a 
cantilevered way also. Both elbow bodies are rigid and share the hinge point but their mass centers 
are allowed to be offset from the hinge point. As in the previous examples, the continuous 
transverse displacements of the beams are discretized using an assumed modes approach. One 
percent modal damping is added to the model to represent structural damping. Actuation is assumed 
in the form of shoulder and elbow torques. The equations of motion for this system, which are quite 
extensive, are available in [12]. 

An examination of the equations of motion for this system makes it clear that the complexity 
of the mass matrix and nonlinear inertial terms could be diminished immensely by dropping most 
terms involving the generalized elastic coordinates and speeds (i.e., the "ruthless" case). It would be 
advantageous to know, then, if these terms have a significant effect on the dynamics of a chain of 
elastic bodies if we are limited to the low rotational speeds and translational accelerations encountered 
during slew maneuvers with joint torques limited by the requirement that flexible deflections remain 
small. In order to investigate this, we propose to examine the three models mentioned at the end of 
section three, to note, consistent, inconsistent, and ruthless, as applied to the two link arm. The 
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complexity of the motion equations precludes an analytical study akin to the one presented in the 
previous section. We will therefore rely entirely on numerical simulation and compare the 
performance of the three "models" of the arm when it is subjected to a smooth slew maneuver. 

5 numerical Simulation 

The motion equations for the two-link, flexible, planar manipulator, consistently linearized in 
small elastic deflections and speeds, were programmed in FORTRAN and implemented in a 
VAXstation 2000. The code allows for a maximum of four cantilevered modes per beam. The mass 
matrix is inverted using LU decomposition and a Runge-Kutta fourth-order scheme with adaptive 
time step is utilized for the time integration [18]. Energy and angular momentum checks are built into 
the simulation. Table 1 shows the physical properties assumed for the arm (see also Fig. 3). These 
numbers were chosen to mimic an actual experimental testbed built at Martin Marietta by Dr. Eric 
Schmitz [19]. 


Physical Properties of Planar Manipulator with Two Flexible Links 

Mass of shoulder body (kg) 

20.0 

Length of link 1 (m) 

0.9144 

Mass density of link 1 (kg/m) 

1.33937 

Length of link 2 (m) 

0.9144 

Mass of elbow body (kg) 

14.0 



Mass density of link 2 (kg/m) 

0.669685 

Other lengths (see Fig. 3): 

Mass of tip body (kg) 

2.0 

bi (m) 

0.0762 



b 2 i (m) 

0.0762 

Moments of Inertia (about axis perpendicular to plane): b22 (m) 

0.0127 

Shoulder body (kgm 2 ) 

0.01 

b t (m) 

0.0508 

Elbow body (kgm 2 ) 

0.03 



Tip body (kgm 2 ) 

0.01 




Table 1: Physical Properties of the Two-link Manipulator 


All the trajectories presented below were run in open loop after the torques had been 
computed from the inverse dynamics problem (given the desired angular trajectories) assuming rigid 
links for the manipulator. For all simulation runs, only two assumed modes per link were used, 
since this gave adequate results and was computationally much cheaper than running the full four 
modes per link. With two modes per link, the first two system vibration frequencies were obtained 
to within three percent of the value obtained using four modes per link. 

In the following, reference is made to time-scaled trajectories. Time-scaling of nominal 
trajectories [20] is achieved by replacing time as the independent variable by the new variable 

r = at, a > 0 

where a is a constant. When a is greater than one, the trajectory is sped up, while if a is less than 
one it is slowed down. From this it is apparent that rates scale like a, while accelerations, and thus 

torques, scale like the square of a. These relations are used in the following sections to select a 
scaling factor that yields a desired maximum value of angular rate, or a given maximum value of 
torque to obtain desired maximum link tip deflections. 

5.1 A Smooth Slew Maneuver 

In section 4.1 it was predicted that the more severe limit on the validity of other than 
consistently linearized equations would be the limit on rigid body angular rates. In the case of chains 
of flexible bodies, as exemplified by the two-link manipulator in this simulation, the nonlinearity 
arising from dependence on configuration makes it difficult to select a "characteristic" angular rate in 
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the general case. This suggests a case by case approach. Several different slews were compared in 
reference [12]. We report here only one, perhaps the most interesting. 

The smooth trajectory is obtained by assuming a form of the joint angular time histories of the 
equivalent rigid manipulator that is quintic in time. This allows the specification of angle, angular 
rate and angular acceleration at the initial and final times of the trajectories [21]. This trajectory is a 
deployment maneuver. Both links undergo significant rotational motion, and the outboard link 
translates. Fig. 4 shows the computed torques for the nominal trajectory. 



Fig. 3: Schematic of two-link 
Planar Manipulator 


Fig. 4: Computed Torques for 
Second Smooth Trajectory (Nm) 


Figure 5 shows the nominal trajectory. In this figure, plots for both the ruthless and the 
consistent models have been overlaid. The inconsistent model fails for this case. The failure is a 
numerical divergence during time integration. This is perhaps related to the fact that the angular rates 
for the nominal maneuver are as large as thirty percent of the "fundamental vibration frequency." 
The system frequencies for the manipulator with locked joints are seen to fluctuate from 3 rad/sec to 
4 rad/sec for corresponding elbow relative angles of zero to 135 degrees [12]. For this reason, in the 
case at hand reference is made to "one" fundamental frequency, and it is assumed that it lies in the 
range specified above and is about 3.5 rad/sec. For the two models shown, the agreement is again 
excellent for the shoulder angle and tip deflections, with the elbow angular position being off by only 
a maxim um of ten percent relative error. 


Convergence of all three models is achieved if the nominal maneuver is slowed down 


(a=0.1685). Figure 6 shows that for this case, all three models yield identical results. This 
confirms the predictions in section 4.1, and further suggests that within the limit of validity of the 
inconsistent model, the ruthless is as good as the inconsistent, and actually better since it is much 
easier to obtain. Note that the angular rates are well within ten percent of the fundamental. 


The last case considered consists of the nominal trajectory scaled upwards in time ( 0=1.2). 
As in the previous section, it was desired to reach the "hard" simulation limit of link tip deflections of 
about ten percent of the link lengths. As Figure 7 shows, only the ruthless model did not fail under 
the given speed-up of the trajectory, even though tip deflections should oily be about four percent of 
link lengths. This divergence is traceable to a near singularity of the mass matrix, related to the fact 
that the links are modeled with distributed mass, while they are in fact nearly "massless springs." 
The man ipulator mass distribution is dominated by the elbow and the tip mass. Reference [12] 
reports that the ratio of the largest to smallest mass matrix singular values is on the order of 10 6 , and 
that this range is configuration dependent It is not known why this numerical ill-conditioning of the 
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mass matrix appears to be exacerbated by including dependence upon elastic deformation, as in the 
consistent model. 

In summary, the above results again show a strong correlation between the limit of validity of 
the inconsistent model and the maximum values of angular rates. In this case, however, no 
"characteristic" rigid body rate is apparent. The limit at which the inconsistent model fails seems to 
be even before any of the two angular rates (shoulder or elbow) reach ten percent of the fundamental 
vibration frequency. A strong point can still be made, nevertheless, in that the ruthless model is as 
good as the inconsistent whenever the inconsistent is valid, and more conservative since the ruthless 
model does not fail. Even at high angular rates the ruthless model yields results that are 
quantitatively very close to the consistent model results. 

Finally, it is worth pointing out the excellent agreement in the tip deflections for both links, in 
all cases. This is probably due to the fact that the equations of motion (see [12]) are elastically 
decoupled, and, while inertially coupled, the elastic degrees of freedom mass matrix (Mee of section 
3) does not depend on elastic nonlinear terms due to linearization. Also, the agreement in shoulder 
angle and angular rates is remarkable. This indicates that, depending on what state variable is of 
interest in a given trajectory, the ruthless model will be as good as the more cumbersome consistent 
model. In all cases, the ruthless is more conservative and better conditioned than the inconsistent 
model. 

5.2 Other Trajectories 

The above qualitative results for the smooth slew maneuver have also been confirmed for two 
other maneuvers: a smooth maneuver in which the elbow motion is kept to a minimum and the two- 
link manipulator is slewed in extended configuration like a "beam"; and a time-optimal, bang-bang 
control slew maneuver where joint torques were assumed limited. Details of these results, together 
with an analysis of numerical considerations, can be found in [12]. 

6 SUMMARY AND CONCLUSIONS 

Having looked into the general form of the linearized dynamics equations for chains of 
flexible bodies undergoing large rigid body motions, but small elastic deflections, we concluded that 
some terms cannot be obtained through the use of linear strain-displacement relations. These terms 
were seen to be critical in the simple rotating beam example as they provide the geometric stiffness 
terms necessary to obtain physical results. The absence of these terms in inconsistently linearized 
equations limits their validity to relatively gentle rigid body motions. The fact that these terms are 
unobtainable for the general case of an arbitrary flexible body led us to consider possible 
simplifications of the general motion equations, consistent with such restrictions on their 
applicability. 

The two alternative models studied, the ruthlessly linearized model and the inconsistent 
model, are subject to several limits in applicability. While the consistent model requires we keep 
elastic coordinates and speeds small, the two alternative models will only be accurate if we further 
maintain low rigid body angular rates. There also exists some translational acceleration or speed limit 
that needs to be considered, although for the cases studied this limit was of no consequence. Within 
the domain of validity of both simplified models, it appears the ruthless model yields results as 
accurate as the correct consistently linearized model. In addition, preliminary results promise that the 
ruthless model will result in large reductions in computational time in the simulation of large flexible 
multibody systems. This coupled to the simplification of the dynamicist's task inherent in the 
adoption of ruthlessly linearized models makes this option an attractive alternative. 

From the above it is clear that the inconsistent model should never be used. Further, the 
more cumbersome and hard to obtain consistent model should only be used when necessaiy (i.e., 
when the domain of validity of the simplified models is exceeded). Finally, it is our strong belief that 
the ruthless model deserves widespread use. 

Simulation misbehavior at certain trajectories for relatively high rigid body angular rates was 
tracked down to numerical ill-conditioning of the configuration dependent mass matrix. This 
problem was attributed to modelling error inherent in choosing cantilever (clamped-free) modes to 
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model the flexible deflections of the manipulator links. In ref. [12] it is suggested that this results in 
effectively modelling one (or more) massless degrees of freedom. Thus it became apparent that 
physical modelling of bodies, and adequate selection of assumed modes and numerical procedures, 
can be as important as sensible simplifications of the motion equations. 
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Abstract 

A total Lagrangian finite element formulation for the deformable bodies 
in multibody mechanical systems that undergo finite relative rotations is 
developed. The deformable bodies are discretized using finite element 
methods. The shape functions that are used to describe the displacement 
field are required to include the rigid body modes that describe only large 
translational displacements. This does not impose any limitations on the 
technique because most commonly used shape functions satisfy this 
requirement. The configuration of an element is defined using four sets of 
coordinate systems: Body, Element, Intermediate element, Global. The 
body coordinate system serves as a unique standard for the assembly of 
the elements forming the deformable body. The element coordinate system 
is rigidly attached to the element and therefore it translates and rotates 
with the element. The intermediate element coordinate system, whose axes 
are initially parallel to the element axes, has an origin which is rigidly 
attached to the origin of the body coordinate system and is used to 
conveniently describe the configuration of the element in undeformed state 
with respect to the body coordinate system. A mixed sets of Cartesian 
translational and rotational coordinates are used in order to define the 
location and orientation of the deformable body coordinate system with 
respect to global inertial frame of reference. The nonlinear dynamic 
equations of motion developed, for deformable multibody systems that 
undergo large relative displacements, are expressed in terms of a unique 
set of time-used. These invariants can be generated for each finite element 
prior to dynamic analysis. The invariants of the deformable body can be 
obtained by assembling the invariants of the elements using a standard 
finite element processor. The nonlinear formulation presented in this paper 
has been implemented in the general purpose computer program DAMS 
(Dynamic Analysis of Multibody Systems) that automatically constructs and 
numerically solves the nonlinear equations of motion of multibody systems 
consisting of interconnected rigid and deformable bodies. The linearization 
used in other finite element methods (such as the updated Lagrangian 
formulation) for describing the large displacements of deformable bodies is 
also discussed. 
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Abstract 

It is common practice in the analysis of constrained multibody systems 
to apply the constraints as a set of separate algebraic equations and embed 
them into the governing equations for the specified time of the simulation. 
However more realistic systems are those where different constraints could 
be applied at different times, and hence the multibody system must be able 
to accommodate for the sudden changes. If the multibody system doesn't 
develop the appropriate initial conditions to satisfy the constraints 
whenever they are applied, the system performance will then be hampered 
and it will fail to accomplish its tasks. 

If a multibody system is subjected to constraint equations that are 
released and applied at different times during the motion they characterize 
the so-called intermittent constraints. This paper objective is to address 
the continuity of motion when a dynamical system is suddenly subjected to 
constraint conditions. Motion discontinuity due to the initial constraint 
violation is avoided by prior control forces that adjust the motion and yield 
velocity and acceleration consistent at the point of application of the 
constraint. The optimum control forces are determined for a specified 
control interval. The method proposed provides an optimum adjustment of 
the system's motion and assures that the stresses developed at the system 
components are kept within acceptable limits. The procedures developed 
will be illustrated making use of inequality constraints applied to obstacle 
avoidance problems in robotics. 


*PhD, Member ASME 
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Abstract 

Research in robot control has generated Interests in computationally 
efficient forms of dynamic equations for multi-body systems. For a simply 
connected open-loop linkage, dynamic equations arranged in recursive form has 
been found to be particularly efficient. A general computer program capable of 
simulating open-loop manipulator with arbitrary number of links has been 
developed based on an efficient recursive form of Kane’s dynamic equations. 
Also included in the program is some of the important dynamics of the Joint 
drive system, i.e., the rotational effect of the motor rotors. Further 
efficiency is achieved by the use of symbolic manipulation program to generate 
the Fortran simulation program tailored for a specific manipulator based on 
the parameter values given. This paper describes the formulations and the 
validation of the program, and it also shows some results. 

Introduction 

In the development of a robotic manipulator, simulation program can be an 
important design tool. It caui be used to support detailed mechanical design by 
revealing the constraint forces and torques at different locations during 
certain maneuvers. It can also be applied to test different control laws 
without concerns of damaging the actual manipulators. If real time simulation 
can be developed, training of telerobot operators and testing of actual 
control hardware and softwaire can become possible. 

The success of a simulation in providing the useful and accurate 
information depends on the model fidelity, the formulation of equations of 
motion and the numerical solution of the equations. There is no such thing as 
"the" simulation of a dynamical system because the fidelity of the model 
determines what the results are like. There is always room for higher fidelity 
and so there is no end to it. But quite often, a modest increase of model 
fidelity is accompanied by a significant increase In equation complexity and 
numerical difficulty, and thus computation time. To achieve reasonable 
efficiency in the computation, one has to investigate the merits of different 
solution algorithms, different dynaunical formulations and different levels of 
model fidelity. Additionally, one has to validate that the program is 
correctly representing the model. 

Many researchers have worked on efficient formulations of dynamic 
equations for robot manipulators [2-9] . Most of them model robot as consisting 
of rigid bodies connected together with revolute or translational joints. 
Details of the joint drive systems have been mostly ignored. It is shown in 
[7] that joint drive systems have potentially significant effects on robot 
dynamics and hence should be included in the model. Also shown in [7] is a 
procedure to obtain the dynamical equations of a robot with a speed-reduction 
drive system from the equations of a direct drive robot. This procedure will 
be followed to develop a more comprehensive robot simulation program. 
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It has been known [2,5,6] that the important aspect of efficient 
formulations is the recursive development of kinematic and dynamic quantities 
to reduce the number of transformations among vector bases. It is also known 
that recursive Lagrange’s formulation is still less efficient than the 
recursive Newton-Euler’ s formulations. However, Newton-Euler’ s formulation 
will not be advantageous if more complicated model of the system is analyzed. 
Since the program under development is anticipated to be expanded for more 
comprehensive modeling of manipulator systems, Kane's method is chosen because 
of its systematic features. An efficient formulation has been developed by 
applying recursive schemes in Kane’s equations for a general manipulator 
system. The forward and backward recursions are established based on the 
bounds on the summation signs in the equations. 

If properly developed, it is expected that a customized simulation 
program for a particular manipulator should be more efficient than a general 
purpose simulation program. For the development of simulation program, there 
is always a trade-off between generality and efficiency. But through the 
application of symbolic manipulation to eliminate unnecessary computations 
that occur for a particular model, it is possible to improve simultaneously 
the generality and the efficiency of a simulation program. Symbolic 
manipulation language MACSYMA has been used to develop a program called MSP 
(Manipulator Simulation Program) for manipulators that are made up of a single 
chain of any number of rigid bodies connected by revolute joints. Gear 
reduction effects of some simple joint drive systems sure also efficiently 
incorporated in the program following the procedure in [7]. 

Independent formulation and programming of the system kinetic energy and 
the system angular momentum about a base-fixed point on the 1st Joint axis are 
developed for validation purposes. Test cases which involve conservation of 
these quantities have been selected to validate the simulation programs. The 
objective of this paper is to present the formulation involved in the 
development of this program. Computation efficiency and significance of gear 
reduction effect are also to be discussed. 

Mathematic Model 

An open chain manipulator with N degrees of freedom as shown in Fig. 1 is 
analyzed for the development of MSP. Each link is driven with a motor and a 
gear reduction mechanism, an example of which is shown in Fig. 2. The base is 
considered fixed in the earth E (assumed to be an inertial reference frame). ^ 
Couples are generated at motors through electromagnetic intersu:t ions, suid gear 
reductions amplify the resulted moments on the links about the Joints. It is 
assumed that the motor rotor and its rigidly attached part is the only massive 
element in a joint drive system that will contribute to the modifications of 
the equations of motion from that of a multibody direct drive system. 

The links are labeled consecutively to B^ starting from the link 

connected to the base. The base is referred to as link B 0 - The ideal revolute 
joints between links are numbered such that joint i connects link B. to link 
B An orthogonal unit vector basis x., y. and z. fixed in B. is defined in 

i -1 iii i 

such a way that the unit vector z (i = 1 N) is directed along the axis 

of joint i. A particular configuration called the null configuration of a 
manipulator is one in which relative joint angles between links are all equal 
to zeros. The joint angles q. (i = 1 N) are positive when right-handed 

rotation from the null configuration about z. occurs. In this paper, the motor 

driving link B. is assumed to be mounted on link B. . and unit vector e. is 

defined to be parallel to the rotation axis of the motor rotor. 
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Formulation of Dynamical Equation 


The following presentation of the formulations will be in terms of 
vectors and dyadics which are quantities independent of unit vector bases and 
can be represented by column and square matrices, respectively, when expressed 
in a particular basis. 


Kane’s dynamical equations[l] are 


F + F = 0 
r r 


(r = 1 N) (1) 


where F and F* are the generalized active and inertia forces associated with 
r r 

the r-th generalized speed, respectively. Since the system is holonomic, the 
number of generalized speeds are equal to the number of degrees of freedom. 
Here the generalized speeds are chosen to be simply the derivatives of the 
generalized coordinates. Assuming that the only contributing active forces are 
the motor torques T^, the gravitational forces and the external load on the 

last link, represented by a force F e acting through the mass center and a 
torque X** one can write 


F = \x T + 

r r r 


G) + V 


+ (a) 


(r = 1, N) (2) 


- 2 1 M.(v. • G. _ r _ r 

i-l i — i t r “N,r — N, r 

where a is the gear ratio for the r-th joint, M. is the mass of i-th link, G 

r i 

is the gravitational acceleration vector, and V. and w. are the r-th 

* i G ** 

partial velocity of B., mass center of B., and the r-th partial angular 
velocity of B., in E, respectively. The generalized inertia forces are due to 

the inertia force and torque associated with each link, and they are 

N N 

F* = -.E,M.(V. * a.)-. 2, w. •(!.* a.+ u.x I ♦ w ) (r = 1 N) 

r i *1 t — i,r “ i 1-1 “ l , r -1 - 1 ~i -1 “ l 

where a. and a. are the acceleration of B. and the angular acceleration of B 

“1 A 1 1 

in E, respectively, and 1. is the central inertia dyadic of B.. Therefore, 

the dynamic equations become 

N N 

.JL M. ( V. • a.) + u>. • (I.* a. + u> x 1 • w ) 

i^1 i — i , r —i i - 1 -i,r -i -i -i ”i 

N 

= p T + .£, M. (G • V. ) + V ♦ F e + to. 

'r r i-l i — — i , r , r 


(3) 


to. = 2, q. z. 

-i j-1 M j -j 

V. = -Si q (z.x r c .) 

-i j -1 j -j -ji 

10 . = z £ 1 

-i,r — r r 


V. = (z x r c .) 

— i , r — - — — - 

c 


*r ” n r 

* 

where r.. is the position vector from Q. to B. and 

— ji J i 

e \ _ f 1 if i i r 

S ~ I 0 if r > i 


i 


a. = .j 

-1 j*1 

q. z . + 

J -j 

i 


a. = .J 

-i j*1 

q. (z.x 

1 -j 


I* 

(r 

= 1 N) 

(4) 

equations. 



(i = 

i,... 

,N) 

(5) 

(i = 

i,... 

.N) 

(6) 

(i = 

i,... 

. N) 

(7) 

(i = 

i,. .. 

.N) 

(8) 

(i.r 

= i.. 

.. ,N) 

(9) 

d as 




(i = 

i,... 

. N) 

(10) 

(i = 

i,... 

,N) 

(11) 
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where 


a\ = -2« q.(w. x z.) 

“i j*l -j -j 

a* = .2 q . - 3 ir ( z . x r c . . ) 

* jVi ''j dt -j -j i 


( 12 ) 


i - 1 


= a’ x r c + (j. x(g>. x r?.) + q.z.xt-n: r L ..) (13) 

-i — ji -i -i —u j - 1 j j dt — ji 

and L” Is the position vector from to Q.. A left superscript on a time 

differentiation symbol represents the reference frame in which the 
differentiation is to be performed! 1 ] . With equations (5-13) substituted, the 
equations of motion can be rewritten as 

I. \< <*! - ".V f , (r * 1 N > <«> 

where 

Z • f , (15) 


A’ .= 
n 


- ri 


f .= .£.[ I. • z. + M. r c . x (z.x rf.) ] 
~n j*i - j -i j -pj -i -ij 


(i £ r) 


T = 
r 


I r = a,) * I"]} - I* -<l£x E*) 


a. = 


a; - G 


T* = 
— 1 


(16) 

(17) 

(18) 

(19) 

(20) 


I • a’. + w.x I .♦ a, 

— i -i -i =i 

Because of symmetry, only upper triangular terms of matrix [A’ .] need to be 

r i 

evaluated. The following formulations are used to evaluate the vector quantity 
f .. 

— r i 


f. = I. 

— i i -i 


z . 

— i 


(i = 1, N) 


( 21 ) 


I, =)?,< 1, ♦ - cjj cJi » 

= s, * I 2 <e« W) - - c !<i..) c i.r 

„ (i = N-l 1) (22) 

K. = 1. + M.[(r c .) 2 U - r c . r c .] + ( M. ) ( (r^ , ) 2 U - r L . . ,r L . 1 

-1 “I 1 *“11 “ —11 —11 J*1 + 1 J ”1(1+1) “ —1(1+1)— 1(1+1) 

N (i = 1 N) (23) 

r* =T. M. r c . 

-i jV, j -,j M 

- E:'., * H, * (£.,>(.) cj,,.,, (1 = N-l 2) (24) 

The dyadic quantity £. is a constant in B., i.e, if £ is expressed in terms 

of x., y., z. basis, the coefficients are constants. Equations (22) and (24) 

are backward recursive formulas that can be evaluated by establishing the 
following: 

I = I + M [ (r c ) 2 U - r c r c ] 

Am ±m V in - *-nm hiN 


= I 




N 

*“M “ ^-MM 


(25) 

(26) 


b m/°n 


where I is the inertia dyadic of relative to Q^. For the off-diagonal 
terms of [A ], a backward recursive formula involving r, i.e., 
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f .= f, + r , , x (z.x r.) 

— r i — (r+1)i — r(r+1) “i — i 


( 


i = N 


') 


( 27 ) 


r = i-l» . . . , 1 

can be used with f being the starting vector that should have been evaluated 
from equation (21). 

For T , the following forward recursive formulations can be used to 
evaluate the necessary quamtities. 
w. = t*>. + q.z. 

-i -1-1 i~i 

a* = a* + q . (w.x z . ) 

- 1 “i-1 i “i — i 

u. = a. - A. • r?. = u. „ + A. • r^. , x . 

—i —i =i ~“ii “i-l -i-l “(i-l)i 

where 

A. = a'.x U + (cj.x U)*(a>.x H) 

= i - 1 = - 1 - “i = 

The starting values for equations (28-30) are 
« 0 = 0 

Si = 0 

Uo = -G 

The introduction of dyadic quantity is to reduce the overall computation by 

reusing it in two equations in the remaining formulations. The dyadic obtained 
by cross multiplying a vector with a unit dyadic can be represented by a 
skewsymmetric square matrix when it is expressed in a particular unit vector 
basis. This skew symmetric square matrix is commonly encountered when a cross 
multiplication of column matrices is replaced with a matrix multiplication. 

The following backward recursive formulation can be used to evaluate the 


(i = 

1 ... 

. ,N) 

(28) 

(i = 

1 . . • 

...N) 

(29) 

(i = 

1 ... 

. . ,N) 

(30) 

(1 = 

1 .. 

• ,N) 

(31) 


(32) 

(33) 

(34) 


ic 

quant i t 

ies 

T .: 





1 

N 


+ 

T— 

+ 

of 

II 

A.* 
- 1 

[( 

<M.) r L / . , + M. r { 

j - i + 1 j “i(i+i> i — 

T. 

= T. + 

r L . 

- x 

d. + L. + [ M. r c . + 

” 1 

-1+1 

“Kr 

►1) 

^1+1 “1 1 “11 

L. 

“i 

ii 

IO 

• 

% 

1 

' 2 

(K,.:U) U)> v 


(i = N-l 1) 

(36) 

. 1 xu. 


—iO+1) —i 


(i = N-l 1) 

(37) 

(i = 1 N) 

(38) 


where 


and a subscript v next to a dyadic in equation (38) denotes the vector of the 
dyadic, which is formed by summing the cross products of the prefactors and 
the postfactors of all the dyads in the dyadic. The reason for using the 
expression in equation (38) is to reduce computation counts. In fact, L can 

be expressed in a form identical to equation (20). Conversely, equation (20) 
can be replaced by 

i; = { a,.* ii r 2 LP> V ci = i nj 09 ) 

where the expression in brackets [ ] is the dyadic whose representation in a 
particular unit vector basis is the inertia matrix with half its trace 
subtracted from each diagonal element. The dyadic operations used above follow 
the convention introduced by Gibbs[10] in late eighteen hundreds. 


The starting values for the recursive equations, equations (36) and (37), 


are 


4. = M n V Ei 


KIN 

U = m n ( v E e/ V + 4, 


(40) 

(41) 
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Equations (14) represent the equations of motion of a direct drive 
open-chain system. The modifications to equations (14) for an open-chain 
system with motors and gear reduction mechanisms shown in Fig. 2 are based on 
the difference between generalized inertia forces contributing to these two 
systems. They are[7] 

(F r } s = ‘Os'* G r (r = 1, ...» N) (42) 

where subscript s represents a manipulator system S which haLS a motor and gear 
reduction mechanism in each link similar to that shown in Fig. 2, while 
subscript s’ represents the manipulator system S’ which has the saune mass and 
inertia distribution as system S, but the motor rotors and gear reduction 
mechanisms are considered to be fixed in the links on which they are mounted. 
Here and throughout this paper, the motor driving i-th Joint is assumed to be 
mounted on link B. (i=l N). System S* , therefore, represents a direct- 

drive system. The difference terms between these two systems are 


G = 
r 


-p J (p q + a • e ) 

■ r r * r — P “ 1 


( r > i ) 

( r = i ) 


r r r *r 

N 

•2 “P.J.[(z • e.)q. - q.(a>. ,x z )• e,] 

i - r + 1 i — r — i M ^ ) -i-1 — r “i 

With the additional terms added, the dynamic equations for S become 

N 


A . q. = (i T - T + G 

1-1 r l i r r r r 


( r < i ) 
S bec< 

(r = 1, . . . ,N) 


where 


A = A' . + 
r l r i 


u.J.(z *e.) 

*i i — r —i 


(if r M ) 
(if r = i ) 


G = - p J a 9 *e + T p.J.q.(<*>. x z )*e. 

r r r -r-1 ~ r i^r+1 1^1 -i-1 — r — i 


(43) 


(44) 

(45) 

(46) 


It can be noticed that matrix [ A^ . ] is still symmetric. Hence, only those 
difference terms in the upper triangle of matrix [A^.] need to be evaluated. 


Symbolic Manipulation 

Direct numerical approach in evaluating the above equations can be 
inefficient if there are terms involving multiplication with 0 or 1, or 
addition with 0. The use of mult i -dimensional arrays in a general purpose 
prograun further reduces the computational efficiency. These are some of the 
reasons why a general simulation program cannot achieve the highest possible 
efficiency. Symbolic language such ats MACSYMA can be applied to eliminate 
these inefficiencies. Theoretically, one can use MACSYMA to derive equations 
explicitly in terms of all joint angles and their derivatives and then to 
reorganize the equations for efficient computation. But for a manipulator with 
high number of degrees of freedom, this requires enormous memory space and CPU 
time, and cannot be optimally simplified because the simplification is limited 
by the capability of MACSYMA. It has been found that the recursive formulation 
8 ls presented above is pant icularly advantageous because computation is already 
optimized. Only simple symbolic operations need to be applied to generate the 
recursive equations of motion in FORTRAN coding, and hence the computer time 
required for this process is not excessively long. 

Program Validation 

Complete validation of a simulation prograun is next to impossible. But 
without being subject to some forms of validation, a program cannot be 
trusted. When a program is applied on a reasonably complicated system, one 
cannot rely on simple statements like “the results madce sense" as validation. 
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For a manipulator system with 3 or more links, intuition as to its motion when 
subject to certain inputs does not work well at all. A more systematic 
approach need to be adopted. This is a very important subject for the dynamic 
simulation of robots, but it does not seem to attract much attention. The 
approach taken by the authors was to select some conditions under which the 
system will have some scalar quantities, such as energy or measure numbers of 
am angular momentum vector, that are conserved throughout the motion. Since 
there exists a slight possibility that chance may cause errors not to be 
detected, an independently developed program is used to evaluate the 
validation quantities. Formulation of the system kinetic energy, potential 
energy and the system angular momentum about a base-fixed point on the first 
joint axis for validation purposes will be described next followed by the 
description of test cases chosen. 


The kinetic energy formulation for a direct-drive manipulator S’ is 

K’ = l A A A., q.q. (47) 

2 i - 1 j - 1 1 j i j 

For a manipulator with gear reduction in its drive system, slight modification 
is required. Consider two manipulator system S and S as described before. If 
they have the same motion, then the kinetic energies K and K of systems S and 
S’, respectively, are related by[7] 

K = K* + .|l ( 2 + ^ (48) 

The potential energy V of the system are due to gravity only, and it is 

v = M, G • r‘, (49) 

There is no difference between the potential energy expressions for S and S 
because they have the same mass distribution. The angular momentum vector H’ 

of a direct-drive manipulator S’ about Q 1 is 

H’ = A § rj x (m p Y) (50) 

J j 

where P is a generic particle in B., £ represent the summing over all the 


rn and 
P 


E V P are, 


particles in B., r p is the position vector from Q 1 to P, and 

respectively, the mass and the velocity in E of P. Only the measure number of 
H* in z direction is used in the validation process. Comparing the kinetic 


energy formulation with that of H’ , one can obtain 


H’ 

where A 


M “ i?1 


1 i 


q ; 


(51) 


1 i 


cam be found in Eq. 


about Q of systems S and S' , 
H = H* + p.J.q.e. 

- i¥l *1 i i 


(15) with r = 1 . The angular momenta H and H’ 
respectively, are related by 


(52) 


Hence, equations (48-52) provide the necessary formulas for the evaluation of 
the conservation quantities. 


The validation cases used are 
I. Conservation of total energy, K+V : 


T =0 (r = 1 N) 

r 

II. Conservation of H • z.: 

a. g = 0, T 1 = 0, p 1 = 1, e 1 = z 1 

b. g * 0, = 0, ^ = 1, e, = z, 

where g is the gravitational constant. 


= ± £/g 
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Under conditions described above, many simulation runs of manipulators 
have been performed to validate the program. The results of these runs show 
that numerical variations of the conservation quantities are of the order of 
magnitude that is appropriately correlated to the absolute integration error 
tolerances. Here integration subroutines using Adams-Bashforth-Moulton 
predictor-corrector scheme and using Runge-Kutta-Verner method have been 
separately applied for numerical integration. 

Discussions 

For a general manipulator, the total numbers of operations to obtain A 

and M r T r - T r (r,i = 1,...,N) in Eq. (14) for a direct drive manipulator are 

11N(N-1 )+150(N-l )-15 multiplications and 7N(N-1 )+l 19(N-1 )-14 additions. 

Table 1 lists the numbers of operations required for each of the equations in 

the evaluation of matrix [ A ^ . ] and T . The counting of operations follows that 

presented in [6]. Notice that unit vector basis transformation has to be 
performed in each recursion step. Here external force and torque applied on 
the last link are not included. Therefore, for a general 6 link manipulator, 
1075 multiplications and 791 additions are required. This is much lower than 
the 1541 multiplications and 1196 additions needed in Method 3 of [2]. For the 
six dof PUMA 600 presented in [9], Our MSP program generates FORTRAN code that 
requires 351 multiplications and 281 additions to perform the computation that 
takes 392 multiplications and 294 additions in [8]. 

Table 1. Number of Operations 


equations 

mul t ipl i cat ions 

additions 

(22) 

57 ( N-l ) -33 

361N-1J-25 

(24) 

8(N-1 ) 

7(N-1 )-3 

(27) 

1 1N(N-1 )-8(N-l ) 

7N(N-1 )-4(N-l ) 

(28) 

8 ( N- 1 ) 

5(N-1) 

(29) 

10(N-1) 

6(N-1) 

(30) 

17 ( N— 1 ) 

13(N-1) 

(31) 

6(N-1 ) 

9(N-1 )+l 

(36) 

17(N-1 ) 

13(N-1 ) 

(37) 

20(N-1 ) 

19(N-1 ) 

(38) 

15(N-1 )+3 

15(N-1 )+l 

(40) 

9 

6 

(41) 

6 

5 

Total 

1 1N( N-l )+150(N-l )-15 

7N(N-1 )+119(N-l )-14 


Adding q.z. to the right hand side of equation (29), equations (28-41) 

together with equation (17) become inverse dynamic formulation for a direct 
drive robot. This inverse dynamic evaluation is similar to algorithm 3 in (6]. 
By applying MACSYMA to the formula, some unnecessary computations can be 
removed. For instance, if N is 6, the number of computation for the 
manipulator with twist angles equal to 0° or 90° is 340 multiplications and 
290 additions compared to 388 multiplications and 370 additions in [6]. For 
the simpler manipulator with r®. and rj having only one nonzero element, 

the numbers of computation are 245 multiplications and 204 additions compared 
to 277 multiplications and 255 additions in [6]. The reductions are due to 
some additional multiplications and additions with zero quantities that are 
counted in 16] because the authors of [6] did not actually expand the 
equations for the counting. 

In order to give additional indication of efficiency, the 7 link Robot 
Research Corporation [11] manipulator shown in Fig. 3 with parameters listed 
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in the Appendix is considered. The numbers of operations for the system in the 
form of Eq. (14) are 651 multiplications and 505 additions with effects of 
motor rotors included. The numbers for direct-drive system are 548 
multiplications and 439 additions. Therefore, adding the effects of motor 
rotors requires 103 multiplications and 66 additions, which is about 17% of 
the total computation needed for the direct-drive system. 

In the interest of demonstrating the effect of motor rotors, a constant 
motor torque (T^ 0.625 N-m) is applied on the first joint of the 7 link 

manipulator shown in Fig. 3. Two sets of equations are solved for comparison. 
Set 1 is equation (14), which represents the direct drive system S' and set 2 
is equation (44), which is the complete equations of system S. The results 
from set 1 are shown in Figs. 4 and 5 while those from set 2 are in Figs. 6 
and 7. The differences of the results are so substantial that it is clear that 
set 1 is not representative of the actual system. 

Among all the additional terms due to motor rotor, the terms p 2 J.q. 

(i=l,..., N) are most significant due to the large values of p.. Another set 

2 1 

of equations, set 3, established by adding only p.J. to diagonal elements 

A of [A* ] matrix in equations (14), is also solved for comparison. A 

sinusoidal motor torque (T^ 3.125 cos 0.8ITt kg-m) is applied on the first 

joint of the manipulator. Some of the results from set 2 are shown in Fig. 8 
while the differences of the results between set 2 and set 3 are shown in 
Fig. 9. It is clear that the differences are relatively small in the duration 
of 5 seconds. 



Fig. 3. A 7 Link Manipulator 
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Conclusions 


The initial development of an efficient simulation program for 
telerobotic manipulators is described. An efficient recursive formulation of 
Kane’s dynamic equations for a class of manipulators which are made up of 
links connected with revolute joints and driven by motors with reduction 
mechanisms is presented. The recursions are established according to the 
summation bounds in the equations. Comparison of operation counts with other 
published formalisms shows advantages of the present approach. Furthermore, 
effects of rotor inertia and speed reduction are included in the formulation 
to yield a more faithful model of the actual system. Symbolic manipulation is 
also applied to generate customized simulation program for additional 
improvement of the computational efficiency. Aside from the discussions on 
efficiency, steps taken to validate the simulation program are presented. 
Finally, simulation results show that effects of rotors in drive system with 
high speed reduction cannot be ignored in the simulation of a manipulator. 
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Fig. 4. Solution of Equations Set 1 
(joints 1—3) 



0.25 0.5 0.75 1 


time (seconds) 


Fig. 5. Solution of Equations Set 1 
(joints 4 — 7) 


















Appendix 

The parameters of a 7 link Robot Research Corporation manipulator used in the 
simulation from which results are presented in this paper are in the 
following. 

Mass (including motor and speed reduction mechanism): (kg) 



1 

2 

3 

9 

5 

6 

7 


104.2 

55.4 

2578 

20.4 

11. 1 

4.7 

4.7 


2 

Moment of inertia (including motor and speed reduction mechanism) : (kg-m ) 



I X X 

Iyy 

JTi 

link 1 

5705 

2.05 

0.7 

link 2 

0.4 

0.4 

0.3 

link 3 

0.6 

0.6 

0.2 

link 4 

0.2 

0.2 

0.2 

link 5 

0. 1 

0. 1 

0.04 

link 6 

0.03 

0.03 

0.03 

link 7 

0.03 

0.03 

0.03 


Speed reduction ratio, rotor inertia (including attachment) and rotor axis (in 
link fixed unit vector basis): 



speed ratio 

rotor inertia 
(kg-m z ) 

rotor axis 

base 

155 

0,003 

l 0 , 0 , 1 J 

link 1 

160 

0.003 

[ 1 , 0 , 1 ] 

link 2 

200 

0.0003 

[ 0 , 1. 0 ] 

link 3 

200 

0.0003 

[ 1 . o, 0 ] 

link 4 

200 

0.0002 

[ 0 . 1 , 0 ] 

link 5 

200 

0.0002 

[ 1 , o, 0 ] 

link 6 

160 

0.0002 

[ 0, 1 , 0 ] 


Joint to mass center position vector (m) r ® . , i-1, . . . , 7 : 



7 Z 

1 lnk’T 
link 2 
link 3 
link 4 
link 5 
link 6 
link 7 

- 0 0 5“ 

0 0 0 

0 0 0 

0 0 0 

0 0 0 

0 0 0 

0 0 0 


Joint to joint position vector (m) r j (i+1) > i-1.---. 6 : 



X 7 Z 

link'! 
link 2 
link 3 
link 4 
link 5 
link 6 

575 575 575 

0.3 0.1 0.0 

0.1 0.0 0.3 

0.3 -0.1 0.0 

0.08 0.0 0.4 

0.19 -0.08 0.0 
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Abstract 


In this paper a nonlinear Lagrangian formulation for the spatial 
kinematic and dynamic analysis of open chain deformable links consisting 
of cylindrical joints that connects pair of flexible links is developed. The 
special cases of revolute or prismatic joint can also be obtained form the 
kinematic equations. The kinematic equations are described using 4x4 
matrix method. The configuration of each deformable link in the open loop 
kinematic chain is identified using a coupled set of relative joint variables, 
costant geometric parameters, and elastic coordinates. The elastic 
coordinates define the link deformation with respect to a selected joint 
coordinate system that is consistent with the kinematic constraints on the 
boundary of the deformable link. These coordinates can be introduced 
using approximation techniques such as Rayleigh-Ritz method, finite 
element technique or any other desired approach. The large relative 
motion between neighboring links are defined by a set of joint coordinates 
which describes the large relative translational and rotational motion 
between two neighboring joint coordinate systems. The origin of these 
coordinate systems are rigidly attached to the neighboring links at the joint 
definition points along the axis of motion. The geometry of the deformable 
links are included in the formulation by two costant parameters which 
accounts for the length and twist of the deformable link in tis undeformed 
state. The kinematic equations that define the global position and velocity 
of an arbitrary point on a deformable link is developed in terms of the 
relative joint variable, constant geometric parameters, and elastic 
coordinates of deformable links. These kinematic equations are then used 
to develop the energy expression of the deformable link. The nonlinear 
terms that represent the dynamic coupling between the large relative 
motion and the mall elastic deformations is identified and presented in 
terms of a set of time-invariant quantities that depend on the assumed 
displacement field and provide a systematic approach to study the spatial 
dynamics of open loop kinematic chains. The system differential equations 
are then developed and expressed in terms of these set of invariant 
quantities using Lagrange's equation of motion. 
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Abstract 

The state of the art dynamic response analysis of flexible multibody 
systems is currently restricted to elastic bodies with homogeneous 
materials. The requirements for high speed operation has made it 
necessary to use lightweight multi layered composite bodies in robotic 
systems and space structure applications. Dynamic modeling and analysis 
of such systems are particularly important since the effects of body 
flexibility to the performance are likely to be more pronounced. 

In this paper first the eight-noded isoperimetric quadrilateral element 
with independent rotational and displacement degrees of freedom is 
extended to laminated composite elements. The element includes an 
arbitrary number of bonded layers, each of which may have a different 
thickness. The transverse shear deformation which is a predominant factor 
in the analysis of laminated composite structures is taken into account in 
developing the stiffness and mass matrices. The corresponding 3-D mode 
shapes are then incorporated to the multibody system dynamical equations. 
Floating body reference frames allow the selection of different boundary 
conditions, and the dynamical equations contain all the nonlinear 
interactions between the rigid and elastic motion. Example simulations are 
presented to illustrate the methods proposed. 


* Assistant Professor, Member ASME, AIAA 
’PhD, Member ASME 
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ABSTRACT 

The dynamics model here is a backhoe, which is a four degree of freedom 
manipulator from the dynamics standpoint. Two types of experiment are chosen that can 
also be simulated by a multibody dynamics simulation program. In the experiment, 
recorded were the configuration and force histories; that is, velocity and position, and force 
output and differential pressure change from the hydraulic cylinder, in the time domain. 

When the experimental force history is used as driving force in the simulation 
model, the forward dynamics simulation produces a corresponding configuration history. 
Then, the experimental configuration history is used in the inverse dynamics analysis to 
generate a corresponding force history. Therefore, two sets of configuration and force 
histories-one set from experiment, and the other from the simulation that is driven forward 
and backward with the experimental data-are compared in the time domain. More 
comparisons are made in regard to the effects of initial conditions, friction and viscous 
damping. 

INTRODUCTION 

With recent developments in dynamic simulation software, there have been steady 
improvements in analysis and design of multibody mechanical systems. The performance 
of a software package has been frequently compared with that of another, but rarely with 
experimental data. In this research, dynamic simulation is compared with experimental data 
in time domain. 

Through the dynamic simulation, the rigid-body (or flexible-body) equations of 
motion generate the positions, velocities and accelerations of the components of a given 
system, and the reaction forces at the system’s joints. The equations of motion are usually 
idealized by not including Coulomb friction and viscous damping, and by simplifying 
actuating force elements. These idealizations manifest the limitations in the mathematical 
modelling of a dynamic system. There are also limitations in experimentation. The 
experiments provide the factual data of the actual system. But such data are not completely 
reliable because of errors in measurements and subsequent data analysis and interpretation. 

The system chosen for the research is J.I. Case 580K backhoe. From a dynamics’ 
standpoint, it is a manipulator of four degrees of freedom (dof) with an operator in the 
loop. Three dofs are controlled by hand levers for digging, scooping up, and dumping 
operations, and one dof by a pair of foot pedals for left and right swing motion. These 
four dofs are individually controlled by hydraulic cylinders that comprise a complicated 
circuit 

APPROACHES 

The approach taken here was to divide the simulation task into multibody dynamics 
and control element modeling, each of which was separately validated and then later 
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combined together. In this paper, only the validation of the multibody dynamics is 
presented The multibody dynamics includes the modeling of each component and its 
joints, with the assumption that applied forces or torques are supplied by the control 
elements. The validation of will thus enable unbiased evaluation of the multibody dynamic 
simulation, without being influenced by the modeling technique of the controller, i.e., in 
this case, the hydraulic cylinders and circuitry. 

The verification effort started with defining a set of static and dynamic quantities 
that were both measurable in the experiment and obtainable from the simulation, and that 
were capable of describing the system status at any specified time. Such quantities were 
identified as positions and velocities of the system components, and forces acting on the 
system’s joints. In the simulation, the post-processing analysis recovered these quantities 
easily. In the experiment, however, each quantity demands its own transducer with signal 
conditioning and data analysis. As a result, experiments were carefully orchestrated with 
the available equipments, so that the mathematical model could simulate the same operation 
as in the experiment. 

Although there is no established method of validating dynamic simulation in the 
time domain, the strategy adopted here makes use of the forward and backward (inverse) 
dynamic analyses, with experimentally known time histories of position and joint forces. 
The position history that had been measured in the experiment was input into inverse 
dynamic analysis, which generates a force history that would have driven the simulation 
model along the input position history. Under the ideal condition such that the dynamic 
simulation describes the exactly same behavior of the actual system, the two force histones 
— one from the experiment, the other from the inverse dynamic analysis — should be the 

same. But, in reality, there inevitably exists a discrepancy between these two. This 

discrepancy is viewed as a measure of the validation. Similarly, the force history that had 
been obtained in the experiment was fed into forward dynamic analysis, which generates 
position history that would have been exactly the same as measured under the ideal 
condition. Again this position history was compared with the experimental position 
history. 


EXPERIMENTS 

Since the boom carries most of the load, its static and dynamic stress analysis is the 
major concern in design and analysis. Once the dynamic model is validated, it should 
generate reliable joint reaction forces for dynamics and stress analysis. The experimental 
effort was thus concentrated on the boom and its hydraulic cylinder. The transducers were 
attached to boom are a load cell that measures the boom cylinder force output, a differential 
pressure transducer between the supply and dram sides of the boom cylinder, and a 
position/velocity transducer for the boom cylinder piston movement. The experiments 
were conducted by actuating the boom cylinder with various fixed configurations of the 
dipper and bucket assembly. Among those various configurations, two of them were 
selected for experimentation and simulation. First, the bucket and the dipper were tucked 
in under the boom, as shown in Fig. 1 . Second, the bucket and the dipper were stretched 
out, as shown in Fig.3 . 

Experimental data were digitized, inspected, and recorded in the IBM PC/AT at the 
experimental site. Later in the lab, the PC was connected to the local network to unload the 
data to an Apollo workstation. The data were then retrieved, filtered, interpreted, and 
supplied for comparisons of experimental with theoretical results in static stress analysis 
and dynamic behavior, and verification of hydraulic actuator models. 

Two types of experimental data are used in the dynamic simulations: force and 
relative displacement histories of the boom cylinder for the forward and backward 
simulation. Both quantities were measured while the backhoe was being operated through 
a predefined trajectory. The relative displacement was measured by a position transducer 
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with one end attached to the piston and the other end to the cylinder housing. The cylinder 
force was measured by a load cell placed at the piston end of the cylinder. 

Experiment I 

In the first experiment, the backhoe is in folded-up configuration; that is, the dipper 
and bucket cylinders are fully extended, so the dipper and bucket are tucked in under the 
boom. The angle between the boom and the dipper is about 42 degrees. The only degree of 
freedom allowed is the rotation around the revolute joint between the boom and the swing 
tower. At the beginning of an experiment, the boom was in upright position, making an 
angle of 4 degrees with the vertical (Fig. 1). The boom was slowly lowered from the 
upright position until it reached 38 degrees of boom angle, then stood still a few seconds, 
and was brought back up to the original position. The duration of this operation was about 
30 seconds. 

A typical relative cylinder displacement history measured by the position transducer 
is shown in Fig. 2. Since the boom cylinder extends while the boom drops downward, the 
rising trend of the displacement history should be interpreted as the downward motion of 
the boom. 
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The motion of the backhoe can be divided into three stages 
Figure 2 Position history of experiment I 


Experiment II 

In the second experiment, the dipper and bucket cylinders were fully retracted so 
that the backhoe stretched out to its longest reach (Fig. 3). The bucket initially rested on 
the ground. The boom slowly lifted the bucket up until the bucket reached about 2 m above 
the ground. Then it brought the bucket back to its original position. This time the whole 
operation took about 8 seconds. Figure 4 shows a typical relative cylinder displacement 
history measured in experiment II. 



Figure 3 Initial configuration of backhoe in experiment II 
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DYNAMIC SIMULATION 


Modelling 

The dynamic modelling of the backhoe started with a relatively simple model 
including only the major components, and then added more components such as pins until 
every single component was accounted for. Table 1 lists the major components and the 
types of joint used in the model (also see Fig. 6). One of the most significant changes 
made in model refinements is the addition of the weights of pins and hydraulic fluid. These 
masses have been regarded insignificant until we found that the simulation model was 
lacking in the total inertia. 



these two active joints are modelled 
as frictionless and no damping 


this is the only joint modelled 
considering friction and damping 

Figure 5 Active joints of backhoe 


Among the joints, three of them are active during the experimentation, the locations 
of these active joints are shown in Fig. 5. There evidently exist viscous and Coulomb 
friction damping forces at these joints. Since the revolute joint between the tower and the 
boom is the biggest joint among them, the complexity of the analysis is reduced by 
modelling that joint as the only joint with viscous and friction damping. 
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Table 1 Bodies and Joint Types 
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Figure 6 Joint Definition 


Tower 


Initial configurations 

In the time-domain validation, the initial conditions of the simulation must first and 
foremost equal those of the experiment. This requires static equilibrium analysis in the 
simulation and accurate measurements of position and reaction forces in the experiment. In 
fact, some initial configurations had to be excluded on the grounds that they were statically 
indeterminate. 

At the beginning of experiment I when the system is in static equilibrium, the force 
measured at that moment was used to calculate the exact initial position of the backhoe in 
the experiment. The position measurements were also available from the experiment, but 
were much less accurate than the force measurement, because a little error in position 
measurement resulted in a huge error in the corresponding equilibrium force at the initial 
configuration, which was almost vertical. 

In experiment n, a different approach was taken to determine the initial 
configuration. Since the initial position of the tip of bucket was precisely known, the 
initial configuration was determined by using this fact The mass center of the swing tower 
is defined as the reference coordinate center of a simulation, so the vertical distance from 
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the reference center to the ground has to be measured. This distance was measured to be 
about 0.8m. The initial configuration of experiment II is thus obtained based this 
information and the kinematic relations between bodies of the backhoe model. 

SIMULATION AND COMPARISON 

In comparison with experiment I, several viscous damping ratios have been tested 
in the dynamic simulations. Figure 7 shows force comparison in which viscous damping 
does not play a significant role. Indeed, it was a slow operation, so the viscous damping 
force was expected to be small. However, the effect of viscous damping is pronouncedly 
exhibited in the displacement comparison. In Fig. 8, the viscous damping coefficients of 
10 and 15 (kN/m/sec) make the simulation close to the experimental data. 

In Fig. 9, the simulation with the viscous damping coefficient of 15 (kN/m/sec) 
continues to move upward (actual motion downward) even when the actual system stopped 
and stood still, thus exposing the absence of Coulomb friction in the simulation model. 

The existence of Coulomb friction is also observed in Fig. 7. The force from the 
simulation is not reduced by the amount of Coulomb friction force, whereas the applied 
force has already reflected loss from Coulomb friction. Therefore, the simulation force 
would be equal to the sum of the Coulomb friction and applied forces if the simulation 
exactly matched with the experiment. In the first half where the friction force is in the same 
direction with the applied force, the simulation force appears above the actual applied force. 
In the second half where the friction force is in the opposite direction to the applied force, 
the simulation force appears below the actual applied force. 

In experiment II, the long stretch of the backhoe in combination with a faster 
maneuver induced vibrations that are visible in Fig. 10. But the simulation shows no 
vibration but follows the general trend, because the system is modelled with rigid body 
dynamics. Figure 1 1 shows a good agreement between the simulation and experiment in 
position history. 


DISCUSSION 

When the experimental position history was input to inverse dynamic analysis, it 
was differentiated twice to obtain velocity and acceleration. Along with this digitized 
position history, however, noises and discontinuities were also differentiated twice, 
thereby creating quite a few “jerks”, which in turn made the simulation force fluctuate 
spuriously. To correct this problem, three smooth curves of first and third order 
polynomials were pieced together to approximate the experimental position history. At the 
two junction points, spurious peaks are still observed in Fig. 7 and 10. 

Experimental estimation of viscous and Coulomb friction damping should 
accompany the analytical effort in which several dynamic simulations were performed with 
different damping coefficients. These damping forces were not so significant in Fig. 7. 
But their effect on the displacement is quite noticeable as shown in Fig. 8. 

The validation in the time domain requires that the initial condition of the simulation 
should equal that of the experiment This requirement is most of times very difficult to 
satisfy, because it involves static equilibrium analysis in the simulation and accurate 
measurements of position and reaction forces in the experiment 
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FIG. 7 EXPERIMENT I: FORCE COMPARISON 











FIG. 8 EXPERIMENT!: POSITION COMPARISON 
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FIG. 9 EXPERIMENT!: POSITION COMPARISON 




FIG. 10 EXPERIMENT II: FORCE COMPARISON 
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FIG. 11 EXPERIMENT H: POSITION COMPARISON 
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ABSTRACT 

The dynamics of vibrations in flexible structures can be conveniently modeled in terms of 
frequency response models. For st ruct ural control such models capture the distributed param- 
eter dynamics of the elastic structural response as an irrational transfer function. For most, 
flexible structures arising in aerospace applications the irrational transfer functions which 
arise are of a special class of pseudo-meromorphic functions which have only a finite num- 
ber of right half plane poles. In this paper, we demonstrate computational algorithms for 
design of multiloop control laws for such models based on optimal Wiener-Hopf control of 
the frequency responses. The algorithms employ a sampled-data representation of irrational 
transfer functions which is novel and particularly attractive for numerical computation. One 
key algorithm for the solution of the optimal control problem is the spectral factorization of 
an irrational transfer function. We highlight the basis for the spectral factorization algorithm 
together with associated computational issues arising in optimal regulator design. We also 
highlight options for implementation of wide band vibration control for flexible structures 
based on the sampled-data frequency response models. A simple flexible structure control 
example is considered to demonstrate the combined frequency response modeling and control 
algorithms. 

1 Introduction 

Frequency response methods offer several advantages for modeling the dynamics of small 
amplitude vibrations in flexible structures. Such models capture the distributed parameter 
dynamics of the elastic structural response as an irrational transfer function from localized 
actuation to localized deformation measurements. Interest in frequency response models can 
arise from a desire to predict modal frequencies with increased accuracy over that obtainable 
from finite element methods. The frequency domain approach is well suited to optimal control 
law synthesis with specific requirements for precision vibration supression and isolation. Most 
computational methods for optimal control synthesis available to design engineers focus on 
the manipulation of state space models. For flexible structure control, state space models 
are problematic since the question of model order required must be resolved as part of the 
optimal control computation. This paper reports progress in the development and testing of 
computational methods for design of precision control systems for mechanical structures with 

1 Work supported by SDIO and Air Force Wright Research and Development Center under contract F33615- 
88-C-3215. 
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C + 

open right half complex s-plane, 9?e s > 0 

Hoc 

Hardy space of complex functions, analytic and essentially 
bounded in C + 

h 2 

Hardy space of complex functions, f(s), analytic in C + and such 
lh *‘; Sjl/'r- l/WII* *] 1/: < OC for a € C + . 


rational functions in 


Table 1.1: Notation 


elastic effects based on direct frequency response models. The frequency response models 
can arise from finite element analysis, transfer function methods, wave propagation models, 
and/or empirical measurements. Moreover, the computational approach offers a framework for 
integration of frequency response data from various modeling approachs which offer varying 
precision in different frequency bands. The current paper extends the efforts reported in [1], 

We will use the following notation and conventions in this paper. The transpose of a 
column vector will be denoted as x T , Tr X is the trace of the square matrix X, and j = 
y/— T. A Laplace (resp. z) transform will normally be indicated by dependent variable; a-(s) 
(resp. *(z)), however, we often drop the explicit dependence where the meaning is clear 
from the context. The notation u,(s) = tt r (-s) will be frequently used. £{x(t)} indicates 
the expectation of the random process x(t). In this work all random processes are assumed 
wide sense stationary and ergodic so that expectation can be replaced with ensemble average 
where convenient. The notation contained in Table 1.1 specifies the classes of transfer function 
models considered at various points. A rational function has a (partial fraction) expansion 
A(s) = {A(-*)}+ + {A(s)}_ 4- {^(s)}^ where {.} + (resp. {.}_) is analytic in 9?es > 0-the 
causal part (9?es < 0-the anti-causal part) and {.}<» is the part associated with poles at 
infinity. Thus the operation {A(s)} + is causal projection of the frequency response model. 

In section 2 we provide an overview of frequency domain models and modern Wiener-Hopf 
design of multiloop control systems. We motivate the role of frequency response modeling and 
optimal control and identify critical computational steps required for the method. Section 3 
discusses a new approach to the required computations for Wiener-Hopf control which extend 
the algebraic constructions for rational transfer functions to certain irrational cases. We 
highlight the role of coprime factorization in design of distributed systems. Section 4 considers 
a simple, but nontrivial distributed parameter system design. Finally, in section 5 we discuss 
new options for real time control implementation suggested by the computational approach 
of section 3. 

2 Optimal Control of Frequency Response Models: Wiener-Hopf 
Design 

Frequency domain models have been used to articulate the full range of opportunities for 
feedback compensation for internal model stabilization. Algebraic constructions based on 
Laplace transform models of linear, time-invariant system dynamics have been used to describe 
alternatives for standard control computations and realizations for stabilizing controllers [2, 
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3], This together with Wiener-Hopf optimization provides a general approach for resolving 
tradeoffs in regulator design where natural, frequency domain specifications for model-based, 
control performance are available. We remark that restricting attention to rational transfer 
function models in computational approachs to flexible structure control has been primarily 
motivated by convenience [1]. Specific results which extend the constructions of coprime 
factorization and internal stabilization to a certain class of irrational transfer functions have 
been obtained [4], In the present effort we restrict attention to transfer functions in H^and 
meroinorphic with the exception of a small number of right half plane poles. 


Optimal regulator design via Wiener-Hopf methods. Techniques for the solution of 
H 2 optimization problems in multiloop feedback systems have received considerable attention 
in the control theory literature for a number of years. A comprehensive approach to Wiener- 
Hopf design using transfer function models is given by Youla et al [5]. 

A general framework for resolution of tradeoffs in multiloop control design was recently 
outlined by Park and Bongiorno [6]. In general, control design involves the resolution of 
choices in the use of dynamic (feedback) compensation with respect to a nominal dynamic 
model of the system response to an n- vector control, u and an m-vector of exogenous system 
disturbances, e, as seen by p available sensors, y and l (possibly nonmeasurable) regulated 
variables. A frequency domain model for the control design problem (shown in Figure 2.1) 
can be expressed using Laplace transforms as, 



The control architecture is assumed to involve feedback, 

ti(s) = C(3)j/(s). (2.2) 

Then the closed loop compensation will alter the response of the system to disturbances as 
seen in terms of the regulated variables as 2 , 


2 = [G ZU CSJ] 


G 

G z 


v* 


= G zu CSG yt e + G 


(2.3) 


where the system closed loop sensitivity operator is S = [I + GyuC\~ l . 

A major consideration in the classical methods of frequency domain design is closed loop 
stability. In such methods stability considerations must be continually evaluated (using root 
locus or Nyquist plots) as performance tradeoffs are evaluated. For single loop designs of rel- 
atively low order systems, classical frequency domain methods focus attention on the tradeoff 
between stability margins and performance. The modern approach is to use optimization to 

^Suppressing dependence on the Laplace variable s. 
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Figure 2.1: General Multiloop Control System Design Problem 

resolve complicated engineering tradeoffs in multiloop design — subject to the constraint of 
system interned stability. 

The well known Youla parametrization of all stabilizing feedback controllers C which 
stabilize a given plant model was orginally derived for rational matrix transfer functions 
which have coprime factorizations over the ring of polynomials in the Laplace s- variable [5], 
It is now understood that the construction goes through without modification for the ring of 
stable rational functions, RH*, which includes all rational transfer functions analytic in the 
closed right half plane including the point at infinity [7]. Thus under the assumption that the 
plant transfer function has right and left factorizations, 

Gyu = ND- 1 = Df'Nt, (2.4) 

coprime over RH,*, then there exist A", Y, X t , Yt £ RH n such that 

DtXt + N t Yt = /, XD + YN = /, (2.5) 

and each controller, C, which obtains R = CS with R £ RH*, can be parametrized by the 
factorization formulae, 

C = ( X-KN t )-\Y + KD t ) (2.6) 

= (Y t + KD)(Xt - KN)- 1 (2.7) 

for some K £ RH,*,. The importance of this construction is that optimization procedures can 
be applied directly to the choice of K £ RH*, without concern for closed loop stability. This 
fact was first exploited by Youla et al[5] in the description of Wiener-Hopf optimal control for 
frequency response models. More recently, the parametrization has been utilized by Desoer 
and his students [8]. 

Our concern here is with H 2 optimization for a certain class of irrational transfer func- 
tions which are “pseudo- meromorphic” in the sense stable coprime factorizations exist; i.e., 
the constructions in (2.4)-(2.7) obtain closed loop stability with N, D, N ( , D(, X, Y, Xt, Yt, I< £ 
Hoofl, 4]. In the research reported herein we avoid algebraic constructions related to stable, 
coprime factorization of irrational transfer functions as considered by Desoer [7] and instead, 
focus on the development of numerical algorthims for approximating the frequency response 
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of the required objects. Such transfer functions can be adequately approximated over fi- 
nite frequency ranges by (rather high order) rational transfer functions. Models of this type 
arise in the study of vibrations of multibody systems with flexible interactions [9] and wave 
propagation in flexible mechanical structures [10, 11]. 

PSD modeling of control processes for performance specification. A wide class of 
standard control design problems including symultaneous requirements for tracking, distur- 
bance rejection and accomodation, etc. can be represented in the form of the general linear, 
time-invariant regulator problem of Fig. 2.1 and (2.1)-(2.2) where the objective is to choose 
a controller which stabilizes the closed loop system and minimizes a performance criterion in 
the form, 

J = ^Tj J’Z Tr[Qi ’ )PM] ds ' (2 ' 8) 

where P z is an effective Power Spectral Density (PSD) of the regulated variables, z. 

The representation of regulation performance in terms of PSD is quite practical for a va- 
riety of design problems arising in multiloop systems and provides a frequency dependent 
specification of control performance consistent with design requirements for vibration rejec- 
tion. One can extend the significance of PSD modeling to include a wide range of practical 
design considerations. The regulation PSD, P z , can be related to modeling assumptions on 
the exogenous inputs in terms of the closed loop transfer functions; 

P z = [ G UZ R , 7] ^ P t (<3>,, G ze .) . (2.9) 

PSD models of exogenous inputs may include deterministic transient effects together with 
steady state stochastic PSD, $ e ; viz., 

P e (s) = Ai£{e(s)e,(«)} + A 2 $ ee (a), (2.10) 

Formulation of the performance objective, J, may include real, positive, A,-, t = 1,2 which 
permit scaling relative importance of steady state and transient considerations to the com- 
posite performance and £?(s) is included to permit frequency weighting. PSD modeling has 
recently received increased emphasis in the study of vibration control in acoustic regimes [11]. 

Park and Bongiorno [6] also highlight the use of PSD models for minimizing closed loop 
system sensitivity to model uncertainty. Let the system model uncertainty be given as a 
frequency dependent, additive perturbation, G := G+A, which can be expressed in partitioned 
form as, 



Following [6] an effective model uncertainty PSD, Pa = P{AA,}, can be reflected to the 
system regulated outputs, and via superposition, a composite performance objective of the 
form, 

j =^C Tr l QlG - R - *:;]}" 


281 



is obtained for optimal regulator design, where the system model uncertainty can be obtained 
commensurate with the performance specifications as an effective disturbance PSD, 


^yy ^v z 

$ $ 

^ zy ^ zz 



Pe (^yc*i G ze* ) + /<Pa- 


(2.11) 


Here the partitioned terms can be expressed in terms of a priori modeling assumptions; [6] 

$ yy — G y€ P € G ye * + A yu ( A yu )*}2?{A ye ( A yc ).}), (2.12) 

$ yz = G ye P e G ze * + A yu ( A zu )*}i2{ A yc ( A ze )*}) — $zy*i (2.13) 

^ = G ze P'G Z '„+n(E{b zu (bME{A Z '(AM). (2.14) 


It is by now widely recognized that frequency domain response considerations are ex- 
tremely important for robust control design and that performance objectives formulated in 
the frequency domain are important tools for resolving design tradeoffs of relevance to prac- 
tical design problems. However, the common wisdom is that state space modeling offers the 
most reliable numerical framework for the computational problems which arise in optimal 
regulator design. The Wiener-Hopf approach identifies the solution for the optimal controller 
in an explicit form which highlights the role of the algebraic constructions generic to stabiliza- 
tion and the quantitative computations required for identifying an optimal controller. Thus 
given the system architecture (2.1)— (2.2), appropriately chosen stable coprime factors for the 
plant (2.4), a nominal stabilizing controller given in terms of its coprime factors as solutions of 
the Diophantine relations (2.5), and performance PSD’s (2.12)-(2.14), then an optimal closed 
loop system response is obtained (assuming a solution exists) by the formula, 

R = DA - 1 ({Azr'yfO- - {A :'D«G Z u,Q*zyD l M: 1 } + ) n~ l D t . ( 2 . 15 ) 

The explicit form given here depends on operations of causal projection and the solution of 
two causal, spectral factorizations; 

D,G ZU ,QG ZU D = A.A (2.16) 

D&yyDi = DU, (2.17) 

with A, A -1 ,D,f2 _1 6 Hoc. The required controller can then be obtained in the explicit form, 

C = (/ - RG yu )~ l R. 

The computational steps required to identify candidate optimal control solutions for the 
regulator problem include: 1) stable coprime factorization (as in (2.4), 2) identify candidate 
solution to Diophantine relations (2.5), 3) causal spectral factorization, and 4) causal pro- 
jection. We contend that such computations can be effectively supported (in finite precision 
arithmetic) by obtaining state space realizations [3] only for relatively low order, rational trans- 
fer functions. In the sequel, we specifically avoid such an approach since we are ultimately 
concerned with the approximate solution of large (or even infinite) dimensional models. 


3 Frequency Response Computations for Optimal Control. 

Our approach to optimal control computation is motivated by distributed parameter models 
which arise in flexible structure control. The approach we have in mind is based on sampling 
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and interpolation of the frequency response models for the system. The choice of sampling and 
the resulting high order, rational approximations are obtained in the context of the optimal 
control problem as summarized above. 


A computational approach to spectral factorization. Recall that a transfer function 
H(s) E H 2 n Hoc has a unique spectral factorization H{s) = F(s)F*{s) with F € Hoo if: 

1. H\s) = H(s); i.e., H{s) is the transform of a real- valued function h(t). 

2. H(s) = H*(s)] i.e., H(s) is “para-hermittian”. 

3. H{s) is of normal rank; i.e., full rank almost everywhere in C. 

4. H(ioj) is positive, semi-definite and bounded for o> £ R- 

To see that causal projection is a closely related problem consider the following. If H(s) is 
scalar, then with ^(s) = In H(s) we obtain 

$(,) = {^ik + W- ( 31 ) 

= In F(s) + In F*(s), (3-2) 


so that the causal, spectral factorization is related to causal projection via the logarithmic 
transformation; F(s) = exp{ln H(s)} + . 

Our goal is to obtain numerically stable approximations to these related problems for 
transfer functions in Hoc- For application to precision control of flexible structures we require 
wide band frequency domain models so that even rational approximations will be of relatively 
high order. An approach to model order reduction which has recently received attention in the 
literature is based on Fourier series approximation of irrational frequency responses [12] in Ho©. 
Our approach to computations for such models is also based on sampling and interpolation of 
the spectrum, but is motivated by computational requirements for Wiener- Hopf optimization. 
From the above discussion of causal projection we motivate a class of algorithms of interest 
from basic properties of the Hilbert transforms applied to the frequency response Recall 

that the Hilbert transform of a time signal /(*) is defined as a convolution; /(<) = JZo iqSj dr 
and it’s Fourier transform has the property, 


/V) 


u> > 0 
u> > 0 


The inverse Fourier transform of $ is -jsgn(t)<f>{t) where <f>(t) is the inverse Fourier transform 
of $(<*;). A consequence is that the casual projection can be obtained as 

{$(u>)}+ = ^[$(u>) + j$(u>)]. 

In previous studies we reported computational algorithms for causal projection and scalar 
spectral factorization by numerical evaluation of the Hilbert transform integral. Computa- 
tional cost was high due to the fact that the Hilbert transform integral is convergent only in 
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the Cauchy principal value sense [13]. An alternate method for causal projection and spectral 
factorization was considered in [14] based on sampling and interpolation of the system fre- 
quency response. The algorithm developed in [14] employs results of Stenger [15] on numerical 
solution of Wiener-Hopf integrals by sampling and interpolation. Details of the algorithm used 
for the current studies and computer implementation are given in [14], 

In the multiloop, optimal regulator design problem we require the solution of two matrix 
spectral factorization problems (analogous to the solution of control and filtering Riccati 
equations for time domain models). The computational approach exploited in the current 
study is based on a Newton-Raphson iteration for the matrix causal spectral factor; 

F n+ i(t u>) := {[F*(iu>)] _1 £f(tw)[F n (tu;)] _1 | + F n {iw). (3.3) 

The recursion (3.3) can be replaced with a numerically well conditioned problem by iteration 
on the inverse spectral factor; 

[*.«]"' := W (/+ {rcj-'tf [FJ- 1 - /} + )"‘ ■ (3.4) 

By initializing with F 0 (an m x m diagonal matrix) with diagonal elements equal to the spect ral 
factors of the diagonal elements of H the second term of (3.4) remains a perturbation of the 
identity (since [F^] 1 #[F n ] -1 — / — > 0) which regularizes the computations. The algorithm 
used in this work is based on that reported in [14] and is a modified form of the method 
reported in [16]. 

Computation of stable coprime factorizations for flexible structure models. Simple 
models of structural components with elastic effects typically lead to transfer functions in 
HooOnce realistic damping models are included. Linear vibration models of more complex 
structures arising in aerospace applications usually will have transfer function models with 
only a finite number of poles in the closed right half plane. Restricting attention to such 
transfer functions we indicate a simple procedure for coprime factorization over H^. 

Let 5 C be a set of transfer functions analytic in a half plane including C+. Under 
the above assumption any such transfer function F(s) can be expressed in the form, 

P(s) = P s (s) + P s (s) ( 3 . 5 ) 

where P s € 5 C H OO and P$ is rational and analytic in the complement of S with (a finite 
number of) poles outside 5. A stable coprime factorization can be readily obtained for the 
(typically low order) transfer function as, Pg = N r D r *, by well known state space construc- 
tions [3]. Then P has stable coprime factorization, 

P = N r D~ l = [N r - P s D r \b;\ (3.6) 

where N rt D r are 5-stable. The separation of terms in (3.5) is readily carried out given P(s) 
by computing the residues of the finite number of unstable poles contributing to Pg. 
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4 Control Computations for an Elastic Structure 


To illustrate the computational approach for a simple elastic structure we consider the simply 
supported Euler beam with torque control at one end. The beam lateral deformation is given 
by y(t, z) with 0 < z < L and has dynamics described by the dimensionless PDE; 


<h f 

dt 2 


-2C 


d 3 y 

dtdz 2 


+ ?y = o. 

dz « 


(4.1) 


with boundary conditions at z = 0, 


v(t, 0) = 0, 


d 2 y 

dz 2 


= 0, 
*=o 


and at z = L, 


y{t , L) = o, 


with the control moment, r, applied at the right 
(r — » y) for beam control is 


d 2 y 

dz 2 


= r. 




hand end of the beam. The transfer function 


n 


cin stnVi 


(4.2) 


where Aj = (-£ + W 1 - C 2 )sL 2 , \\ = « + i\A ~ C ? ) a ^ 2 > L is the beam length, ( is the 
damping factor, and z is the observation point to be regulated on the beam. The transfer 
function is meromorphic, and z) £ Hoc f° r an y 0 < z < L. 

The regulator problem considered arises from a requirement for asymptotic rejection of 
constant load disturbances at a point jr = 0.7. For the current numerical studies we take 
L — 10., and the effective damping ratio, ( — 0.01. Stable coprime factorization is trivial 
and we take AT r = N t — G yu , D r = D t = 1. Exogenous inputs here include the output 
load disturbance d. and measurement noise model n and are described by their effective PSD 
models representing constant (step) load disturbance and narrowband sensor noise as shown in 
Figure 4.2. The frequency response of G ^ is shown in Figure 4.1 with 1024 uniform frequency 
samples over a bandwidth of 0 < u> < 100. Clearly, the frequency response is irrational and 
no obvious rational approximation is evident. 

The optimal control design is regulation of the beam deflection at z/L = .7 and the 
performance objective is given as, 

J f . Tr{$ v + n$ u }d3 

2nj J —j oo 

where the tracking cost is modeled by PSD, $ y , and the control saturation PSD is = 
E{uu.}. Then given a constraint on the control power the scalar ft. > 0 plays the role of a 
Lagrange multiplier for the optimal design. In this case the required spectral factors; 


(Gyu+Gyx, 4" /<•) — A. A 

(G^Pyd. + $„) = Dft, 


(4.3) 

(4.4) 
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Figure 4.1: Frequency response for pinned-pinned beam control. 


were obtained for p = .1 by the frequency sampled procedure and are displayed in Figure 4.3. 
We remark that the computations of the indicated spectral factors effectively replace the com- 
putational step of solving a pair Riccati matrix equation for the control (resp. filter) problem 
typically encountered in state-space methods for control design. For distributed parameter 
systems, solution of the Riccati equation (a PDE) requires discretization which is accom- 
plished using the current algorithms by sampling and interpolation of the frequency response. 
Thus numerical precision is concentrated over frequency bands significant for the given control 
problem and with sampling under direct control of the designer. The solution obtained is ef- 
fectively a high order rational approximation of the optimal solution with frequency response 
interpolation points chosen by the design engineer. The optimal controller frequency response 
thus obtained is shown for p = .1 in Figure 4.4. 

5 Frequency Sampling Filters for Real Time Control Implemen- 
tation 

The frequency response computations for Wiener-Hopf control outlined and illustrated in the 
previous sections identify various frequency sampled approximations to the ideal, possible 
irrational frequency response for the desired optimal controller. Bandwidth and sampling can 
be chosen by the design engineer to represent specific concerns based on models and/or control 
performance. The frequency sampled computations obtain a specification for the frequency 
response of the ideal (optimal) controller via its sampled representation. The design engineer 
now has several options for implementing the controller depending on available hardware. In 
contrast to the state space approach for finite dimensional systems, several new realization 
opportunities are suggested by the frequency sampling approach. 
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Figure 4.2: PSD for disturbance and sensor noise inputs for beam control. 


A principal concern in implementation of high precision control laws for flexible structure 
control is the order of the realization required for the online controller. The controller or- 
der is usually taken to mean the dimension of the state variable realization of the transfer 
function C'(s) which will be implemented for realtime control. Implementation using analog 
components of high order models is limited by circuit complexity, reliability and cost. As 
a result considerable effort has been expended in methods for model order reduction. One 
approach to controller realization which follows from the frequency sampled computations of 
the previous section is to compute reduced order, continuous time, state space realizations for 
the controller by techniques such as in [12]. 

Digital computer implementations are primarily limited by computational speed and al- 
gorithm complexity effecting the ultimate obtainable sampling rate and considerations for 
reduced order realization of the controller may be required. However, the emergence of special- 
ized computer hardware implemented in VLSI single chip circuits for digital signal processing 
opens new opportunities for realization of realtime control for flexible structures. We prefer 
to consider realization options for the optimal controller in discrete time for implemenation 
on a digital computer. Realization of the controller specified by its frequency samples can be 
obtained using a FIR digital filter implementation. 

Given the specified frequency samples obtained for the optimal controller, 

Ck = C(M), 

at frequencies, u> k = kuiBw/N, where a >bw is bandwidth and N the number of uniformly 
spaced frequency samples we describe the digital filter realization using z-transforms. With 
discrete time sampling rate u>, > 2wbw the frequency samples correspond to interpolation 
points in the z-plane given by 3 , z k = e jk2 * /N ' , for k = 0, . . . , N - 1. The z transform which 

3 Bandwidth and sampling requirements would typically require padding the sequence of frequency samples 
of length N with Ni - N zero values to avoid aliasing. 
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Figure 4.3: Spectral Factors for pinned-pinned beam control. 
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realizes the frequency sampling filter is 

C{z) = Y, CkFu(z) 

k=0 


where the interpolating functions are, 

1 - z-* 
“ N{ 1 - 


with k = 0, . . . , N — 1. A standard computation shows that the frequency sampling filter has 


transfer function 


C(z) 


1 - z 


-N 


N 


N-l 

£ 


fc= o 


Cj, 

l _ e jk2n/N Z -1 


If-1 

£ 


*=o 


(5.1) 


where the coefficients, 

=< 41 ^^ ( 5 - 2 ) 

™ k—0 

f or £ — 0, . .. ,N — 1, are the Inverse Discrete Fourier Transform (IDFT) of the sequence 
Co, . ■ ■ ,Cn- i- The final form in (5.1) shows that the realization is a FIR realization. Such 
realizations are nonrecursive and are efficiently implemented using high speed, single chip DSP 
processors which utilize highly pipelined architectures to achieve high throughput. 


6 Conclusions and Directions 

Wiener-Hopf optimization of frequency domain models has been shown to offer significant 
advantages for computation of precision controllers for irrational transfer functions arising 
in control of flexible structures. Computational algorithms for causal spectral factorization 
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and causal projection can be implemented based on frequency sampled representation of the 
model response. Such models can be obtained from transfer function models or from frequency 
response measurements of the controlled structure. Computations based on frequency response 
sampling have been demonstrated for irrational transfer function models arising in the control 
of flexible structures. 

Requirements for precision control will involve frequency response models which are char- 
acterized by a large number of flexible modes within the control bandwidth. However, for 
control of relatively large, flexible space structures control bandwidth and resulting sampling 
requirements for discrete time control implementations are well within the state-of-the-art for 
high speed digital computers. Frequency sampling filters based on nonrecursive implementa- 
tions can be efficiently implemented in modern DSP single chip processors for realtime control 
of such systems. 

Application of FIR realizations for realtime, closed loop control have not received much 
consideration in the literature primarily due to increased phase lag by comparison with a 
recursive realization. However, the rapidly developing technology for realtime DSP using 
special purpose architectures offers throughput capabilities which may reduce the achievable 
computational delay to within acceptable limits for certain applications. In such cases, high 
order realizations may be feasible using nonrecursive implemenations which cannot be realized 
by recursive methods. 
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Abstract 

The theory of polynomial matrices plays a key role in the design 
and analysis of multi-input multi-output control and communications 
systems using frequency domain methods. Examples include coprime 
factorizations of transfer functions, canonical realizations from matrix 
fraction descriptions, and the transfer function design of feedback 
compensators. Typically, such problems abstract in a natural way to 
the need to solve systems of Diophantine equations (the so-called 
generalized Bezout equations) or systems of linear equations over 
polynomials. These and other problems involving polynomial 
matrices can in turn be reduced to polynomial matrix 
triangularization procedures, a result which is not surprising given 
the importance of matrix triangularization techniques in numerical 
linear algebra. There, we deal with matrices with entries from a field 
and Gaussian elimination plays a fundamental role in understanding 
the triangularization process. In the case of polynomial matrices we 
are dealing with matrices with entries from a ring for which Gaussian 
elimination is not defined and triangularization is accomplished by 
what is quite properly called Euclidean elimination. 

Unfortunately, the nmerical stability and sensitivity issues which 
accompany floating point approaches to Euclidean elimination are not 
very well understood at present. In this paper we present new 
algorithms which circumvent entirely such numerical issues through 
the use of exact, symbolic methods in computer algebra. The use of 
such error -free algorithms guarantees that the results are accurate to 
within the precision of the model data— the best that can be hoped. 
Care must be taken in the design of such algorithms due to the 
phenomenon of intermediate expression swell, the price paid for 
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Abstract 

Thirty years of control systems research has produced an 
enormous body of theoretical results in feedback synthesis. Yet such 
results see relatively little practical application, and there remains 
an unsettling gap between classical single-loop techniques (Nyquist, 
Bode, root locus, pole placement) and modern multivariable 
approaches (LQG and H~ theory). Large scale, complex systems, such 
as high performance aircraft and flexible space structures, now 
demand efficient, reliable design of multivariable feedback 
controllers which optimally tradeoff performance against modeling 
accuracy, bandwidth, sensor noise, actuator power, and control law 
complexity. This presentation will describe a methodology which 
encompasses numerous practical design constraints within a single 
unified formulation. The approach, which is based upon coupled 
systems of modified Riccati and Lyapunov equations, encompasses 
time-domain linear-quadratic-Gaussian theory and frequency-domain 
H theory, as well as classical objectives such as gain and phase 
margin via the Nyquist circle criterion. In addition, this approach 
encompasses the optimal projection approach to reduced-order 
controller design. The current status of the overall theory will be 
reviewed including both continuous-time and discrete-time 
(sampled-data) formulations. The presentation will focus on the 
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Abstract 

The Robust- Control Toolbox [1] is a collection of 40 w M-files” which ex- 
tend the capability of PC/PRO-MATLAB to do modem multivariable robust 
control system design. Included are robust analysis tools like singular values 
and structured singular values, robust synthesis tools like continuous/ discrete 
H 2 /H°° synthesis and LQG Loop Transfer Recovery methods and a variety of 
robust model reduction tools such as Hankel approximation, balanced trunca- 
tion and balanced stochastic truncation, etc. 

In this paper, we will describe the capabilities of our toolbox and illustrate 
them with examples to show how easily they can be used in practice. Examples 
include structured singular value analysis, H°° loop-shaping and large space 
structure model reduction. 


1 Introduction 

The fundamental issue in robust control theory — to find a stabilizing controller that 
achieves feedback performance despite the plant uncertainty, is still the same issue 
addressed by the classical 1930’s feedback theory of Black, Bode and Nyquist (ref. 
Fig. 1.1). Modern robust control theory has resolved many of the issues concerning 
the “gap” between the theory and practice that had grown to troublesome proportions 
in the 1970’s. One key to bridging the “gap” has been the singular value Bode plat. 
Recent progress in Structured Singular Value (SSV), H°° optimal control theory and 
the model reduction techniques utilizing singular values have made the modern robust 
control theory highly practical. 

The inavailability of quality software implementing the techniques of robust con- 
trol theory has, until very recently, significantly limited the access of both researchers 
and engineering practitioners to these techniques. 
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Fig. 1.1 Robust Control Problem. 


Our Robust- Control Toolbox implements the most up-to-date robust control theory 
like Perron SSV, optimal descriptor 2-Riccati H°° formulae, LQG/LTR and singular 
value based model reduction techniques, ... , etc. The toolbox consists of a library 
of 40 functions which extend the capabilities of PC /PRO — MATLAB™ and the 
PC / P RO — M AT LAB Control Toolbox. The toolbox itself represents four man years 
research work done at USC. 

These 40-functions can be catalogued into 3 major areas: 

• Robust Analysis 

— Singular Values 

— Characteristic Gain Loci 

- Structured Singular Values 

• Robust Synthesis 

— LQG/LTR, Frequency- Weighted LQG 

- H 2 , H°° 

• Robust Model Reduction 

— Optimal Descriptor Hankel (with Additive Error Bound) 

— Schur Balanced Truncation (with Additive Error Bound) 
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- Schur Balanced Stochastic Truncation (with Multiplicative Error Bound) 

In this paper, we will highlight the most important functions in the toolbox, 
demonstrate how easily they can be used and show what kind of results can be 
achieved with practical problems. As for the details of the modern robust control 
theory, they can be found in [7], [2] and the references therein. 


2 Robust Analysis 

The objective of robust analysis is to find a proper measure of the multivariable 
stability margin (MSM) against uncertainty. Uncertainty may take many forms, 
but among the most significant ones are noise/disturbance signals, transfer function 
modeling errors and unmodeled nonlinear dynamics, etc. Uncertainty in any form is 
no doubt the major issue in most control system designs. 

Several tools to measure MSM are available: [1], [8] 

• Singular values (Safonov, 1977; Doyle, 1978) 

• Perron eigenvalues (Safonov, 1982) 

• Diagonal scaling via nonlinear programming (Doyle, 1982; Tekawy et al., 1989) 
Let’s define the MSM first: 

Definition 1 Multivariable Stability Margin (MSM)K m , n(-)~ l 

^m(2yu) — 1 = inf {cr(A)\det (I — TyuA) = 0}. 

In other words, it’s the smallest ZF(A) that can make the determinant (/ — Tyu A) 
singular (or the closed-loop system unstable). See Fig. 1.1. 

A theorem summarizes the whole MSM idea: 

Theorem 1 The system is stable for all stable A j with HA^oo < 1, if the MSM 

Km(Tyu)> 1. 

Unfortunately, exact computation of K m (or fi~ l ) would require solution of a non- 
convex optimization problem and is therefore impractical. Fortunately, computable 
upper bounds on K m are available, viz., 

< mf HC.KJiuJB-'IU « P-ll- 

where D := {diag(d\I , . . . ,d n I)\di > 0}. 

Then using these upper bounds (some may be more conservative than others), one 
can assure that the system is stable against the norm-bounded uncertainty UAH*, < 
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A comparison of the available upper bounds reveals that some are much easier to 
compute than others. See table 2.1. 


Table 2.1 


Method 

Property 

Computation 

Reference 

Optimal Diagonal 
Seeding 

n — 3, exact 
n > 3, 3 15 % gap 

demanding 

Doyle, 1982 
Tekawy et al., 1989 

Perron Eigenvector 
Diagonal Scaling 

very close to optimal 
diagonal scaling 

easy 

Safonov, 1982 

Singular Value 

can be very 
conservative 

easy 

Safonov, 1977 
Doyle, 1978 


Let’s see the following example. 

Example: Given a system G(s) with multiplicative uncertianty at its input. Find 


the MSM. 


G(s) 


»+4 

4 j , St 8 

»+4 ' *+8 *+8 


Theorem 1 implies HAH,*, < (||C?(J + G ) 1 || 0 o) '• 



Wadi Sec 


Fig. 2.2 Singular Value vs. K m (upper bound). 

This example reveals that the singular value upper bound is too conservative in 
“robust analysis” . Whereas, Perron SSV is much simpler to compute than diagonally 
seeded nonlinear programming /*. 

3 Robust Synthesis 

Classical control system designers often do “loop-shaping” to meet design specifi- 
cations. So do modern robust control system designers. “Loop-shaping” for mul- 
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tivariable systems is done via the singular- value Bode plot. However, to shape the 
loop transfer function L(s) is nothing but to shape the sensitivity function S(s) = 
(/ + L(s)) -1 and the complementary sensitivity function T(s) — L(s)(I + L(s))~ 1 . 
See Fig. 3.1 


*0-H uo 



Fig. 3.1 SISO & MIMO Loop-Shaping. 

There are several loop-shaping methods available in the Robust- Control Toolbox 


(see Table 3.1), but H°° is one of our favorites. 

Table 3.1 


Methods Advantages 


LQR • guaranteed stability margin 

(Iqr.m) • pure gain controller 


LQG 

(Iqg.m) 


• use available noise data 


LQG/LTR • guaranteed stability margin 
(Itru.mjtry.m) • systematic design procedure 


• address stability and sensitivity 

• almost exact loop shaping 

• closed-loop always stable 


• address stability and sensitivity 

• exact loop shaping 

• direct one- step procedure 



Disadvantages 


o need full-state feedback 
o need accurate model 
o possibly many iterations 


o no stability margin guaranteed 
o need accurate model 
o possibly many iterations 


o high gain controller 
o possibly many iterations 
o design focus on one point 


o possibly many iterations 


H°° 

(hinf.m) 



Example: Classical loop shaping vs. H°° for 2nd order low-damped system. 

Given a plant G(«s) which is 2nd order with damping 0.05 at 20 rad/sec, find a 
controller to meet frequency response Bode plot (see Fig. 3.2) 
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PLANT OPEN LOOP & SPECIFICATION 



Fig. 3.2 2nd order plant open loop (£ : 0.05,0.5) and the L(s) spec. 

A classical design might be decomposed into the following: (see Fig. 3.3) 

Step 1: Rate feedback to improve damping. 

Step 2: Design high frequency (phase margin, BW, roll-off..). 

Step 3: Design low frequency (DC gain, disturbance rejection..). 

The classical result is shown in Fig. 3.4. Now, let’s see how H°° approaches the 
problem. 


H°° Problem Formulation 

We are solving the so-called H°° Small-Gain Problem ([3]) using the numerically 
robust “optimal” descriptor 2-Riccati formulae of Safonov, Limebeer and Chiang [4] 

[5]. 

H°° Small-Gain Problem: 

Given a plant P(s) (ref. Fig. S.5), find a stabilizing controller F(s) such that 
the closed-loop transfer function T ViUl is internally stable and its infinity-norm is less 
than or equal to one. 

But what makes H°° work is its unique and remarkable “all-pass” property: 

At H°° optimal, the frequency response of T yi Ul is all-pass and equal to one (i.e., 

PUII^i;/ 

This means that designers can achieve EXACT frequency domain loop-shaping via 
suitable weighting strategies. For example, one may augment the plant with frequency 
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dependent weights W i, W 2 and W 3 as shown in Fig. 3.6. Then, the Robust- Control 
Toolbox functions augtf.m and augss.m will perform the augmentation and create a 
state-space for function hinf.m to find an H°° controller. Of course, these frequency 
weighting functions have to be chosen so that a stabilizing solution satisfying the H°° 
norm constraint exists. 

In a typical application, either W 2 (s) or Wa(a) would be absent, leading to weighted 
H°° costs of the forms 


min 

F(.) 


w 3 t 


< 1 

oo 


or 


min 

F(,) 


w,s 

w 2 fs 



< 1 . 


In our example, the frequency domain spec, can be split into Wi and W 3 : 


wr 1 = P 


T A ) 


100(0.005* + l) 2 ’ 




as shown in Fig. 3.7. 


SPECIFICATION — > WEIGHTING STRATEGY 



Fig. 3.7 Weighting strategy for 2nd order problem spec. 

The results are shown in Fig. 3.8 for different p’s. Clearly, in the limit (as p 
goes to 3.16) the cost function becomes “all-pass”. The parameter p of Wi is the 
only parameter on which we iterate for design; the Robust- Control Toolbox script-file 
hinfgama.m automates this iteration. 
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Rad/Sec 

Fig. 3.8 H°° results for 2nd order system 
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3.1 H°° Software Execution and Sample Run 

To do an H°° control design with the Robust-Control Toolbox is relatively simple. 
Table 3.2.1 shows the complete user inputs for our sample problem. 

Table 3.2.1 

>> nug = [0 0 400]; dng = [1 2 400]; 

» [ ag,bg y cg,dg ] = tflss(nug,dng)\ 

» sysg = [ag bg;cg dg ]; xg = 2; 

>> wl = [2.5e - 5 l.e — 2 1; 

0.01 * [4.e - 2 4.e — 1 1]]; 

>> w2 = [ ]; 

» w3 = [1 0 0; 0 0 40000]; 

>> [A,B1,B2,C1,C2,D11,D12,D21,D22] = augtf(sysg,xg,wl,w2,w3); 

>> hinf 

Table 3.2.2 shows the output which appears on the screen for a successful run of 
hinf.m. 


Table 3.2.2 



<< H-inf Optimal Control Synthesis » 



Computing the 4-block H-inf optimal controller 
using the S-L-C loop-shifting/descriptor formulae 

- - - Solving f or the H-inf controller F(s) using U(s 

) * 0 (default) - - - 

Solving riccati equations and performing H-infinity 

existence tests: 


1. 

Is Dll small enough? 

OK 

2. 

Solving state-feedback (P) riccati . . . 



a. No Hamiltonian jw-axis roots? 

OK 


b. A-B2*F stable (P >« 0)? 

OK 

3. 

Solving output-injection (S) riccati . . . 



a. No Hamiltonian jw-axis roots? 

OK 


b. A-G*C2 stable (S >= 0)? 

0K 

4. 

max eig(P*S) < 1 ? 

OK 

all 

tests passed — computing H-inf controller 

• • • 


DONE! ! ! 


— » 
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4 Robust Model Reduction via Basis-Free Tech- 
niques 

In the design of controllers for complicated systems, model reduction arises in several 
places: 1). Plant model reduction, 2). Controller model reduction, 3). Simulation of 
large size problem. 

However, naive implementations of model reduction methods such as Rosenbrock’s 
stair-case algorithm, Moore’s balanced truncation, optimal Hankel approximation and 
balanced stochastic truncation, etc., can fail on even relatively simple problems due 
to numerical instability. 

The Robust- Control Toolbox implements the “basis-free” version of the latter three 
of the model reduction techniques, which are not only numerically robust but also 
tend to achieve the ultimate result for practical problems. In particular, they all 
possess the following special features: 

1. They bypass the ill-conditioned balancing transformation, so that they can eas- 
ily deal with the “non-minimal” systems. 

2. They employ Schur decomposition to robustly compute the orthogonal bases 
for eigenspaces required in intermediate steps. 

These methods all enjoy attractive L°° error bounds - either an additive error 
bound or a multiplicative error bound. 

• Additive Methods: 

— Optimal descriptor Hankel MDA (ohklmr.m). 

— Schur balanced truncation (balmr.m, schmr.m). 

• Multiplicative Method: 

— Schur balanced stochastic truncation (reschmr.m). 

4.1 Robust Model Reduction Theorems 

For robust control system design, it is desirable that the reduced order model satisfies 
the conditions of one of the following two theorems (ref. [1] [2]). Otherwise, the 
controller design based on the “blind”reduced order plant model can be unstable ! 

Theorem 2 Additive Robustness Theorem; (see Fig. \.l) 

If &{Aa) < for u) < u) r (with and G open loop stable), then the closed- 

loop system will be stable provided that the control bandwidth is less than ut r , where 
u* T := max{ u> \ <r(G(juj)) > d^A^ju/))}. 
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TRUE PLANT G(s) 



Fig. 4.1 Additive modeling error. 

Theorem 3 Multiplicative Robustness Theorem: (see Fig. J^.2) 

If <t(Am ) < 1 for (jj < <jj r (with A m and G open loop stable), then the closed-loop 
system will be stable provided that the control bandwidth is less than u > T , where u> r := 
max{u> | a(K M {jw)) < 1}. 


TRUE PLANT G(s) 



Fig. 4.2 Multiplicative modeling error. 


4.2 Examples of Model Reduction 

Example 1: Find a 3-state reduced order model for the transfer function 

0.05(a 7 + 801s 6 + 1024s 6 + 599s 4 + 451s 3 + 119s 2 + 49s + 5.55) 
~ s 7 + 12.6s 6 + 53.48s 6 + 90.94s 4 + 71.83s 3 + 27.22s 2 + 4.75s + 0.3 

with Hankel singular values of the phase matrix 


01 

02 

03 

0< 

06 

0^6 

<T 7 

0.9959 

0.9972 

0.9734 

0.7166 

0.5635 

0.0021 

0 
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The results produced by the Robust-Control Toolbox functions — ohklmr.m, schmr.m 
reschmr.m, are shown in Fig. 4.3. Clearly, the Schur BST-REM method that keeps 
the reduced model staying inside a prescribed relative error bound produces the ul- 
timate result in model reduction. Note that 07 = 0 indicates only the basis-free 
methods can handel the problem without numerical difficulty. 

Example 2: Model reduction for a large space structure [6] (see Fig. 4.4). 

Our design reequirement is to find a controller to track LOS loops in 300 Hz BW 
and to reduce plant disturbance response by a factor of 100. 

The Hankel singular values after the inner loops are closed indicate that the system 
is non-minimal. Therfore, only the “basis- free” methods such as - Schur BT (schmr.m) 
or Schur BST-REM (reschmr.m) can be used. Again, only the Schur BST-REM can 
match the original model up to a “robust frequency” which is high enough so that 
the required BW of 300 Hz can be satisfied (see Fig. 4.5 & Fig. 4.6). 



Fig. 4.4 Large space structure. 


3i 


SV(db) SV (db) 


LSS 4 -fuse Schur-BT model vs. tool error 


LSS 4-suue Schur Bt model vs. 1 16-«u«e original 




Fig. 4.5 Model reduction using Schur Balanced Truncation. 


LSS 4-ctaie Scfair BST-REM model n 1 16-state ar\g LSS 4-stme Schur BST-REM model vs. local error 




Fig. 4.6 Model reduction using Schur Balanced Stochastic Truncation. 
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5 Importance to the Control Community 


If the ultimate goal of research is to apply the theory to reality, then the contribution 
of Robust- Control Toolbox is clear: 

It provides a vital bridge between modem robust control theory and real control appli- 
cations. 

The following diagram (Fig. 5.1) shows how different groups of people with dif- 
ferent backgrounds can utilize the Robust- Control Toolbox to achieve their personal 
goals. For example classical control designers can use the toolbox to understand the 
theory or to apply on a design. Non-robust control researchers can study the code pro- 
vided by the toolbox together with the papers referenced therein to become familiar 
with the robust control theory. Students can use the toolbox either for robust control 
research or for realistic design studies. It seems that the Robust- Control Toolbox can 
serve people from a variety of backgrounds. 


Robust 

Control 

Toolbox 




Fig. 5.1 The importance of the Robust- Control Toolbox. 
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6 Summary 

The Robust- Control Toolbox 

• Provides a bridge between modern robust control theory and the real-world 
applications. 

• Has the most up-to-date Robust- Control theories and algorithms. 

• Is in readable M-files, so it’s educational. 

• Contains the tools one needs to do robust control system design, analysis and 
model reduction. 

• Is direct, powerful and easy-to-use. 
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Abstract 

Recent advances in optimal control have brought design techniques based on optimization of H 2 
and H~ norm criteria, closer to be attractive alternatives to single-loop design methods for linear 
time-invariant systems. Significant steps forward in this technology are the deeper understanding of 
performance and robustness issues of these design procedures and means to perform design trade- 
offs. However acceptance of the technology has been hindered by the lack of convenient design tools 
to exercise these powerful multivariable techniques, while still allowing single-loop design 
formulation. Presented in this paper is a unique computer tool for designing arbitrary low-order 
linear time-invariant controllers that encompasses both performance and robustness issues via the 
familiar H 2 and H“ norm optimization. Application to disturbance rejection design for a commercial 
transport is demonstrated. 

I. Introduction 

Past three decades have laid a foundation on the theory of optimal control. Issues have been 
actively pursued in algorithms for numerical solution of optimum designs, feedback properties of 
optimal linear feedback (and feedforward) controllers and associated theoretical results of existence 
and uniqueness. Filtering of these wealth of technology down to current practitioners have been 
agonizingly slow. Demonstration and acceptance of these design techniques in typical flight systems 
such as SAS (stability augmentation systems), manual controls and autopilot designs, are almost 
non-existent. Hindrances in this effort are related to concerns raised in the following areas: design 
simplicity, ease-of-modification during flight-test and incorporation of designers' intuition and 
experiences in these "optimum" systems. Presented in this paper is the development of a design tool 
that covers much of the advances in multivariable controls and its potential application to flight 
controls. 

II. Background and Motivation 

Historically multivariable controls have been extensively developed based on optimal control of 
linear time-invariant systems. Class of design problems addressed in the past are optimal linear 
regulator using full-state feedback or estimate-state output feedback. Research efforts to extend the 
usefulness of multivariable control designs within the reach of experienced control designers are 
concentrated in the following areas: 

• Measures of design robustness in the presence of modeling uncertainties'^ 1-21 , 


t Numbers indicate references. 
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• Synthesis methods to achieve trade-offs in performance and robustness 7 - 22 ' 35 * 

• Model and controller order reduction 36 - 39 , 

• Direct synthesis of reduced-order controllers of given structure to achieve trade-off in 
closed-loop performance and robustness, and at the same time facilitate design integration 
over different operating conditions 23 . 

A well-known property of feedback concept is its ability to regulate in the presence of plant 
uncertainties. Measures of robustness are traditionally based on loop stability margins 40 (i.e. phase 
and gain margins of individual control loops while maintaining other loops closed at nominal gains). 
These single-loop robustness tests provide useful design criteria for the evaluation of current flight 
control systems. Recent development in robustness analysis techniques allow designers to examine 
design sensitivity in the multiloop and multivariable settings. To detect conditions for design 
sensitivity, one makes use of the singular values of loop return-difference matrix at appropriate plant 
input/output locations. In addition ji-measure is defined to characterize design robustness to 
uncertainties that are expressed in terms of plant parameter variations or those that have a 
predetermined structure. The latter robustness measure, p-measure, is difficult to evaluate exactly; 
but it provides the most accurate description of design robustness in the presence of structured 
uncertainties. Current research direction is to devise numerical schemes that approximate closely 12 
(i.e. providing "least" conservative upper and lower bounds) or, exactly calculate the p-function for 
some specific types of structured uncertainties 13 ' 21 . Synthesis methods to improve design 
robustness based on a general p-measure are not available. 

Guaranteed robustness of (-6dB,°°) in gain margins and (-60* ,60*) in phase margins from 
optimal linear regulator have motivated researchers in developing design procedures to retain or 
recover these robustness properties for state-estimate feedback controllers 7 (i.e. as in optimal linear 
quadratic gaussian (LQG) designs). Fundamental understanding in the robustness recovery process 
resulted in design methods classified under the category of LQG/LTR where, for example, loop 
properties of a full-state feedback regulator are asymptotically recovered with appropriate selection of 
process noise models 41 , or in the framework of "asymptotic" pole assignment 42 . One possible 
drawback of these procedures is the tendency of having unnecessarily high gains in the estimator 
design during the loop transfer recovery. 

Controllers obtained from a LQG/LTR design procedure are usually of high order (i.e. order of 
the controlled plant model augmented with models for actuation, sensor and design-shaping filter 
dynamics). Implementation of these controllers in digital flight processors may be feasible with 
anticipated advances in computing technology; but will undoubtedly be challenging from the point of 
view of design traceability, reliability and maintainability. However it is often possible to reduce the 
controller order using standard model reduction techniques such as modal residualization 43 , balanced 
truncation 44 and optimal Hankel norm approximation 37 . Careful considerations must be made 
especially in the reduction of controllers so that closed-loop stability, performance and robustness are 
preserved. Frequency-weighted reduction schemes have been developed to address these issues with 
some success 39 . 

A remaining problem is the final integration of "suboptimal" reduced-order controllers to operate 
over a wide range of flight conditions. This is usually achieved in an ad-hoc manner (e.g. curve- 
fitting gain parameters optimized at individual flight conditions as a function of some physical 
quantities such as calibrated airspeed, aircraft weight, aircraft center of gravity and dynamic 
pressure). 

With some of the above outstanding issues unresolved, it is evident that wide acceptance of these 
design techniques has been difficult Additional reason behind this difficulty in technology transfer is 


313 


the lack of intuitiveness in the design approach to handle low-order controllers of given (i.e. 
predetermined) structures (e.g. washout filter in the yaw damper design of a lateral control system 
for turn-coordination, simple a-priori gain scheduling according to physical quantities such as aircraft 
weights, dynamic pressure and calibrated airspeed, synthesis of dedicated structural filters for 
control of lightly passively damped elastic modes, etc...). 

Research in direct synthesis of reduced-order controllers for multivariable controls are being 
actively pursued and have resulted in some promising design algorithms both in the continuous- 
time 23 - 32 and discrete-time^'^ domains. Although theoretical development of the design techniques 
have made significant strive, insights in applying them to the synthesis of practical flight control 
systems are yet to be established. To facilitate the evaluation and technology transfer of multivariable 
control design concepts, there is a need for an efficient and versatile design tool that is able to resolve 
the above issues related to design implementation, performance, robustness and integration (i.e. gain 
schedule) over a wide range of operating conditions. 

III. Design Tool Development 

The objective of the design tool is to provide a unified framework for applying recent advances in 
robust multivariable controls to flight systems. To achieve this goal, development of efficient and 
versatile computational algorithms is needed. Scope of the design concept and procedure will 
hopefully enable and motivate experienced designers to appreciate the importance and value of 
multivariable controls. Steps taken to accomplish the stated objectives are as follows: 

• Formulation of control design problems and solution algorithms for the synthesis of robust 
low-order controllers, 

• Implementation of design algorithms in useful CAD tools for ease in obtaining design 
solutions to a wide class of linear feedback/feedforward controllers, 

• Ability to formulate other design specifications using linear and nonlinear equality and 
inequality constraints. 


Control Design Problem: Multivariable controls have primarily focused on applying 
optimization to the design of control systems. Extensive work conducted to-date are on control of 
linear time-invariant systems. The problem is the synthesis of linear controllers that meet specific 
closed-loop performance and robustness over a range of linearized plant conditions. Surprisingly this 
problem is identical to the one that experienced designers have to confront in their daily design work 
where traditional single-input single-output (SISO) methods prevail. Inadequacies of these SISO 
design techniques are well-known: neglect of cross-coupling effects, difficulty to satisfy multiple 
design requirements, highly dependent on the designers’ experience, trial-and-error. On the other 
hand, advantages behind SISO design procedures are: the simplicity of its final controller structure, 
the ease-of-incorporating designers' experience into the design process and design flexibility for 
post-flight test modification. A useful design tool would combine these existing SISO design 
features into multivariable control synthesis. 

Design Procedure Based on H 2 and H“-Optimization : Design methods for multivariable 
controls can be categorized into two general classes depending on whether the control-laws are 
synthesized based on minimizing a performance measure while satisfying other design constraints, 
or just simply meeting design constraints (e.g. eigenstructure assignment). 

In the category of performance-oriented methods, control algorithms are generally developed 
from optimization of some weighted norms of the plant outputs and control inputs in a closed-loop 
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system subject to deterministic or stochastic disturbances. Two commonly used measures are H 2 
and H~-norms with interpretation in both frequency and time-domains. Until recently, and mostly 
for mathematical convenience, majority of feedback and feedforward control-law synthesis are based 
on H^-norm. Associated design schemes are classified under the following methods: linear quadratic 
regulator (LQR), linear quadratic gaussian (LQG) design and, linear quadratic gaussian design with 
loop transfer recovery (LQG/LTR). 

Over the past decade, practitioners of these techniques have gathered valuable experiences in 
applying these techniques to flight controls. Iterative procedures have been developed to achieve 
trade-offs between performance and control band widths 54 - 55 . However implementation of these 
designs remains an area of concern and need further research development Often these designs with 
full-state feedback structure are implemented using a state estimator or observer. Procedures to 
obtain design robustness in state-estimate feedback are done through the mechanism of Riccati 
equation 41 or eigenvalue placement 42 starting from sufficient conditions for loop transfer recovery. 

These procedures offer valuable insights into the design feasibility based on requirements of 
closed-loop stability, performance and robustness. Unfortunately, difficulty in extending these 
results to encompass traditional design philosophy (e.g. output feedback, low-order controller with 
structure intuitive to designers, gain scheduling,...) remains. Attempts to fit these multiloop high- 
order controllers into low-order and conceptually simple designs using, for example, controller order 
reduction are not trivial and have been partially successful 39 - 43 . This remains to be an area of 
continued research interests. 

A completely different, direct and practical approach to multivariable controls would be via the 
route of parameter optimization. The control design procedure described in this paper is one of such 
methods. It is based on the optimization of an objective function using any pre -defined controller 
structure and subject to additional linear and nonlinear design constraints 56 . The formulation allows 
direct intervention of control designers through the set-up of the design objective function, the 
controller structure and constraints on closed-loop stability, performance and robustness to plant 
uncertainties. This design concept was originally developed in reference 23 for linear time-invariant 
systems using objective function based on EP-norm. The design algorithm was efficiently 
implemented into the computer-aided-design package SANDY. Evaluation of the objective function 
and its gradients with respect to the controller parameters are performed analytically for a 
diagonalizable closed-loop system. Subsequent improvement have been made in the area of 
numerical optimization (e.g. found in the constrained optimization code NPSOL 56 ), and in the 
development of typical constraints encountered in flight control systems such as closed-loop 
damping, covariance bounds on output and control variables. Encouraging results have been 
obtained in a variety of control design applications 22 - 26 * 28 - 31 - 48 . 

Reference 23 has also demonstrated the early application of such a technique in simple design 
situations. Later applications have been in the design of missile autopilot 27 , design of a reliable 
stability and command augmentation system for a commercial transport 22 * 26 , design of an improved 
lateral ride quality control system 48 - 49 . Usefulness of such a design tool has been further 
investigated in the control of a remotely-piloted vehicle 5 ®- 51 and nonrigid manipulators 52,53 . 
Reference 53 actually applied and verified in experiments results achieved using the design 
algorithm 23 to the synthesis of robust compensators for flexible structures with uncertain parameters. 

This research has led to the development of a unified multivariable control design concept that 
addresses virtually all flight control design problems such as stability augmentation systems, gust 
load alleviation, manual and automatic control modes. Typical flight control systems can be 
formulated exactly in the same situation as designers would when conducting designs using single- 
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loop frequency-domain techniques. However, in the design solution, multivariable control methods 
based on H 2 and H°°-optimization will be used instead to derive the appropriate design gains. Section 
V illustrates briefly the design philosophy in the synthesis of a longitudinal control system of a 
commercial transport. 

With this unique design concept developed for solution of optimal FP-norm type of problems, 
the work is later extended to address control issues related to H°°-norm 47 . The overall scope is to 
provide a unified design algorithm for low-order controller synthesis that utilizes criteria based on 
both H 2 and H°°-norms 57 ’ 58 . To achieve this objective an efficient numerical algorithm has been 
developed to solve the following optimal control problems: 

(a) Mixed H 2 and H ° °-Design Objective: Synthesis of feedback/feedforward controllers of fixed 
(i.e. arbitrary) order and structure is based on the minimization of the objective function J given by 
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Note that H zw (s) and H uw (s) are the transfer matrices between the disturbance inputs w(s) and the 
performance outputs z(s) and controls u(s) of the closed-loop system respectively. 

This formulation covers design criteria that are expressible either in terms of FP-norm or H°°- 
norm, or a combination of the two. Another feature of this set-up is its ability to address design 
robustness to plant uncertainties (e.g. structured and unstructured uncertainties at both the plant 
inputs and plant outputs, plant parameter variations) through the use of aggregated closed-loop 
responses over a set of plant conditions, signified by the summation index i (i=l,N p ) and N p is the 
total number of design conditions. An objective function that spans multiple plant conditions further 
provides a means to establish gain scheduling across the entire design envelope. 

Reference 23 demonstrated the usefulness of this design formulation in controlling an S^order 
flexible mechanical system under a non-collocated sensor/control configuration. A second-order 
controller has been designed that is robust to large variation in moment of inertia of one of the disks 
in a four-disk system 23 - 53 . The resulting robust controller turns out to be non-minimum phase. This 
result agrees with the SISO control synthesis procedure 59 for active vibration control. 

The design algorithms for evaluating both H 2 and H°°- norms use an equivalent time-domain 
characterization. The equivalence is established using the familiar Parseval theorem 60 . 

(b) H 2 Design Objective with H °°- Bound Constraints: Alternatively one can define the control 

problem being the minimization of an objective function J based on fP-norm, 

2 


N, 

1 = 1^2 


|Q il/2 Hi w (jco) 
i=i nRi^HiwO©) 
subject to additional constraints 
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for some positive scalar y, (i=l,N p ). Recent work in H°°-optimization 33 follow similar past 
development in optimal control for H 2 -norm problems to a single plant model. Algorithms have 
been developed for state-feedback and output-feedback controllers (of the same order as the plant 
model) that satisfy given H°°-bounds (e.g. equation (3)). These methods can be applied iteratively to 
yield solution of an H°°-optima] control problem. 

The resulting LQG-hke controllers that are solutions to the H°°-optimal control problems suffer 
the same drawbacks associated with traditional LQG controllers. Moreover, solutions of low-order 
controllers (i.e. strictly less than the order of the plant model) for H°°-optimization are still not 
available. Our method provides a convenient framework for H°°-optimization using the early design 
concept developed in reference 23 for low-order controllers. The outcome is a unified design 
procedure that allows control practitioners to examine requirements based on current findings in H~- 
bounds or other related measures (e.g. p-measure, worst-case perturbations in parametric 
uncertainties) for performance and robustness. 


Finite-Time Quadratic Performance Index : One unique feature of the design algorithms for 
H 2 and H°°-norm calculation is the usage of a finite-time quadratic performance index. The objective 
function J(t f ) (with a finite terminal time tf) for both L 2 and L°° norms is given by the following 
equations, 

• For H 2 -norm : 

• Random Initial Conditions: 


Np rtf 

J(tf) = X w Pi I e2ctt E[ z T (t)Q i z(t)+u T (t)R i u(t)]dt < J(tf-)oo) 

i=1 '0 (4. a) 

• Random Forcing Inputs: 

Np 

J(tf) = X w Pi E a [z T (tf)Q i z(tf)+u T (tf)R 1 u(t f )] < J(tf-*») 

i=i n (4.b) 

or 

Np /tf 

J(tf) = f X w Pi e2at E[z T (t)Q i z(t)+u T (t)R i u(t)]dt <; J(tf-><x>) 

f i=1 JO (4.c) 

• For H“-norm: 


Np 

J(tf) = X W PS 
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Jo 


fz T (t)Q i z(t)+u T (t)R'u(t)]dt 


r 

Jo 


w* T (t)w’(t)dt 


(5) 


with w i (t)=w 0 i exp(jto 0 't) where w 0 ‘ and co 0 ' are respectively the direction vector and frequency of 
the "worst-case" inputs w(t) in the H°°-norm evaluation at the i* plant condition. 
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The formulation based on a finite-time horizon provides not only appropriate lower bounds to 
these exact norms, but also an indication on internal stability for disturbable and detectable systems. 
It is well-known that, if treated entirely in frequency domain, synthesis procedure that minimizes 
either the L 2 or L°°-norms of plant outputs would not necessarily guarantee closed-loop stability. 
With the proposed formulation, if an optimum solution exists from the minimization of J(tf) for 
sufficiently large tf, then closed-loop stability will be achieved for a detectable, disturbable and 
stabilizable system. 

Another design concern in multivariable controls is the effect of input directionality 61 upon the 
closed-loop performance. Sensitivity of closed-loop responses to command inputs in the presence of 
plant uncertainties is often neglected or not explicitly defined in design techniques such as LQ, LQG, 
LQG/LTR. The design procedure described herein is based on parameter optimization of design 
objective that incorporates directly responses to specific commands. In this manner, effects of input 
directionality are obviously captured in the design objective through appropriately chosen input 
directions and with the usage of multiple plant conditions. It is envisioned that the ill-conditioned 
problem 61 of multivariable controls would no longer be a design issue. 

TV. Design Tool Implementation (SANDY1 


Figure 1 shows a schematics of the CAD design tool SANDY. The design tool is innovative and 
will serve a useful medium for the introduction of multivariable control concepts to a vast number of 
traditional control designers. 



Figure 1 Schematics of the Design Tool Implementation SANDY 

A procedure is set up to link the design code SANDY with the optimization library NPSOL and 
any user-defined Fortran subroutines defining nonlinear inequality constraints for the control design 
problem. This capability provides great flexibility to incorporate any additional design specifications 
to the problem without affecting the core program. An executable code is then generated to run the 
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design optimization. This version of the executable code can be run repetitively without the need to 
relink when the designers switch between design conditions, alter the design parameters while 
keeping the same design constraints as defined in the user-defined Fortran subroutines. 

Characteristics of the disturbance model, weighting parameters in the design objective, selection 
of the controller design parameters and options in built-in design constraints are entered in one single 
data file. State matrices for the plant models and the controller model are defined in separate data 
files. Printout of the design results will be provided at the end of the program execution. Future 
development will include the generation of design data files for the plant models, optimized controller 
model and the respective closed-loop systems in compatible formats for the analysis packages such 
as Matlab, Matrixx and Ctrlc. 

V. Design Example 

Preliminary development of the above design algorithm for mixed H 2 and H~ -optimization has 
been applied to the disturbance rejection problem for a B767 aircraft (Figure 2). The design objective 
is to synthesize a low-order feedback controller that minimizes the aircraft normal acceleration n^t) 
responses to vertical gust turbulence w(t). The performance relates to both peak responses (i.e. 
worst-case) and mean-square responses to Dryden spectra. 


B-767 Longitudinal Aircraft Model: Weight = 184,000 lbs, Mach = 0.80, 

Altitude = 35,000 ft, c.g. Location = 0.18MAC 



Figure 2 Disturbance Rejection Design for a B-767 Aircraft 
Using Mixed H 2 and H°°-Optimization 
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Mixed H 2 and H°°- performance objective is given by 

f tf 2 

I [10nic g (t)+%(t)]dt 

J = Lim { Sup — + W 2 E[ 10nic g (tf)+52c(tf)] } 

tf^ 00 W(t) ftf 

I w 2 (t)dt 

io (6) 

where 8^(0 is the elevator control. State matrices of the design model are given in the appendix. 

The controller is set up to have output feedback (Figure 2) on pitch rate q(t) and a second-order 
lead-lag feedback compensation on the normal acceleration n^t). Figure 3 summarizes three 
controller designs illustrating trade-off achieved in mixed H 2 and H°°-norm optimization: 

(a) H ^-norm optimization: With W 2 =1.0 and W„=0.0, this design simply solves the minimiza- 
tion of the mean square responses to Dryden turbulence using the controller structure shown in 
Figure 2. 

(b) Mixed H 2 and H— norm optimization: With W 2 =1.0 and W^l.O, this design is a 
predominantly H°°-norm problem yielding results similar to those achieved with algorithms described 
in Reference 33. In this design the mean-square responses of n^t) is roughly 16 percent higher 
than the FP-norm optimized design (Case a) while the H°°-norm is reduced by 20 percent. To recover 
the performance of IP-optimized design (Case a), we simply increase the penalty W 2 on the H 2 - 
norm performance.The following improvement is achieved with small degradation in H°°-norm as 
seen in the next design case. 

(c) Improved mixed H2 and H--norm optimization: With W 2 =10.0 and W^l.O, this design 
provides proper balance between H 2 and H°°-norm performance. The resulting H 2 -norm is almost 
the same as that of the H 2 -optimized design (~ 1.3 percent higher) and the H°°-norm is about 2.4 
percent higher than that achieved in case (b). 



oo (Rad/Sec) 


Design Piramflcn 

W 2 =1.0 , W„=o.o 

H 2 -No«n = 0.011672 
H**-Norm = 0.026687 


W 2 =1.0 , W_=1.0 

H 2 -Noim = 0.01 3501 
H--Norm =0.021446 


W 2 = 10.0 f W«=1.0 
H^-Norm = 0.011825 
H--Noira= 0.021960 


Figure 3 Comparison Between Mxed H 2 and H“-Norm Opiniad Designs 
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Results of the H°°-norm optimization are similar to those achieved by state-feedback or full-order 
output feedback designs using methods described in Reference 33 as indicated in Figure 4. Advanta- 
ge of the current design approach is its simplicity and low order structure. 


o 

5 

2 


CO 



Design Parameters 

( Do yle&G lover) 
H 2 -Norm = 0.01247 
H~-Norm = 0.021 208 


W 2 = 10.0 , W„=1.0 

H 2 -Norm = 0.01 1825 
H~-Noim = 0.021960 


Frequency (0 (rad/sec) 

Figure 4 Comparison with Existing H ““-Optimization Method 


VI. Future Directions 

Recent work have provided several useful mathematical measures for robustness characterization 
of multivariable feedback control design, numerical algorithms for their "exact" calculation and their 
usage in robust control-law synthesis. Basically there are two kinds of robustness measures 
depending on whether they are defined based on frequency-domain (i.e. Nyquist stability) or time- 
domain (i.e. Lyapunov stability) criteria. 

Methods in frequency domain have made significant strive since the early work 12 initiated by 
Doyle. Analysis techniques to determine the p-measure for structured uncertainty are still emerging 
and are most likely computationally intensive 1318 ' 20 21 . Complexity of these algorithms is partly the 
result of the wide variety of possibilities in the modeling of the uncertainty block A (Figure 5). 



Figure 5 Robuet Control Synthesis Bated oo Wont-Cue Uncertainties of A(s) 


Generally, the more structure (i.e. information) one assigns to the uncertainty block A, the more 
difficult it is to determine the necessary and sufficient bounds on p-measure 33 . 
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One approach in robust control-law synthesis is to make use of recent methods for "exact" 
calculation of p. or related measures to define the worst-case uncertainty model, say A*, and 
incorporate this condition into a closed-loop model [I-A*M(s,K 0 (s))] for re-optimization of the 
controller K^s) embedded in the transfer matrix M(s). 

For robustness measures based on time-domain approach 11 , bounds on plant uncertainties 
(aA,aB,aC,aD) in the state matrices (A,B,C,D) are determined (Figure 6) providing sufficient 
conditions for closed-loop stability. These analysis procedures can be elaborated to obtain specific 
worst-case plant conditions. As before once established, these conditions can be implemented into 
the design procedure SANDY where one of the plant conditions represents the worst-case plant 
model from which appropriate design constraints for robustness can be defined. 



Figure 6 Robust Control Synthesis Based on Worst-Cue 
Plant Perturbation ( 


Future work will also examine theoretical development of these robust design algorithms and 
their numerical implementation. These methods will be applied to the synthesis of flight control 
problems (e.g. SAS, autopilots, ride quality control, modal suppression, etc...) that include 
robustness issues such as multiloop stability margins, plant parameter variation and unmodelled 
high-frequency dynamics. Specifically we address the set-up of objective function and relevant 
design constraints (e.g. closed-loop damping, handling qualities in terms of short-period frequency, 
overshoots, command and control bandwidths, stability margins, limited control activities, etc...) for 
the following flight control problems, 

(a) Stability Augmentation Systems: 

• Pitch augmentation system 

• Yaw damper design 

• Disturbance rejection: ride quality, load factor reduction 

• Structural mode stabilization: control of lightly damped structural modes 

(b) Command Augmentation Systems: 

• Integral Controls 

• Autopilots: airspeed, altitude, flight path control 

• Manual control with handling qualities 

• Target tracking 

As one might envision, the control synthesis depends strongly on the design objectives (e.g. 
inclusion of integral control, washout filters for decoupling in steady-state control, anti-aliasing 
filters, time delay, etc...) regardless of the methods used in the determination of feedback and 
feedforward control gains. One of our research goals is to identify components in the synthesis 
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model essential to the design problems stated in (a) and (b) based on common knowledge of the 
design requirements in each particular situation. 
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Appendix 

The following are state matrices of the synthesis model with states {u,a,q,9,x w ,6 e }, inputs 
{8ec,w} and outputs {q.njcg}. 


a =-1.6750e-02 

1 . 1214e-01 

2.8000e-04 

-5. 6083e-01 

1.6750e-02 

-2 . 4320e-02 

-1.6400e-02 

-7.7705e-01 

9. 9453e-01 

1. 4700e-03 

1. 6400e-02 

-6. 3390e-02 

-4.1670e-02 

-3. 6595e+00 

-9. 5443e-01 

0 

4 . 1670e-02 

-3. 6942e+00 

0 

0 

1.0000e+00 

0 

0 

0 

0 

0 

0 

0 

-4.4470e-01 

0 

0 

0 

0 

0 

0 

-1.5000e+01 

b = 0 

0 





0 

0 





0 

0 





0 

0 





0 

9. 4310e-01 





1.5000e+01 

0 





O 

II 

O 

0 

1.0000e+00 

0 

0 

0 

6. 9400e-03 

3 . 2795e-01 

2.3100e-03 

0 

-6. 9400e-03 

2. 6790e-02 


d = 0 0 

0 0 
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Abstract 


The algorithm’s objective is to efficiently solve Dynamic Linear Programs (DLP) by taking advantage of their 
special staircase structure. This algorithm constitutes a stepping stone to an improved algorithm for solving Dynamic 
Quadratic Programs, which, in turn, would make the nonlinear programming method of Successive Quadratic 
Programs more practical for solving trajectory optimization problems. The ultimate goal is to bring trajectory 
optimization solution speeds into the realm of real-time control. 

The algorithm exploits the staircase nature of the large constraint matrix of the equality-constrained DLPs 
encountered when solving inequality-constrained DLPs by an active set approach. A numerically-stable, staircase QL 
factorization of the staircase constraint matrix is earned out starting from its last rows and columns. The resulting 
recursion is like the time-varying Riccati equation from multi-stage LQR theory. The resulting factorization increases 
the efficiency of all of the typical LP solution operations over that of a dense matrix LP code. At the same ume 
numerical stability is ensured. The algorithm also takes advantage of dynamic programming ideas about the cost-to-go 
by relaxing active pseudo constraints in a backwards sweeping process. This further decreases the cost per update of 
the LP rank-1 updating procedure, although it may result in more changes of the active set than if pseudo constraints 
were relaxed in a non-stagewise fashion. The usual stability of "closed-loop" Linear/Quadratic optimally-controlled 
systems, if it carries over to striedy linear cost functions, implies that the savings due to reduced factor update effort 
may outweigh the cost of an increased number of updates. 

An aerospace example is presented in which a ground-to-ground rocket s distance is maximized. This example 
demonstrates the applicability of this class of algorithms to aerospace guidance. It also sheds light on the efficacy of 
the proposed pseudo constraint relaxation scheme. 


Introduction 


The objective of the present work is to develop and test a special-purpose algorithm for the solution of 
Dynamic Linear Programs (DLP). The general form of a DLP is as follows: 
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where the x, vectors constitute the decision vector time history, the c, vectors are linear cost coefficients, and the A„ 
and A a+l matrix blocks and the b, vectors define the linear problem constraints. The bracketed equality and inequality 
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signs in eq. 1c indicate that both forms of constraints may be present; some rows may be equalities while others are 
inequalities. The problem in eq. la-lc is also known as a staircase LP. 


A reason for interest in this problem from an aerospace controls point of view comes from its relationship to 


the following multi-stage trajectory optimization problem: 

r T T T T TT"1 T 

find: x = I u o x i Ui x 2 ... u N .! x N I (2a) 

N-l 

to minimize: J = ^ L(x k ,u k ,k) + <J>(x N ) (2b) 

k=0 

subject to: x 0 given (2c) 

X k+ , = f(x k ,u k Jc) fork = 0 ... N-l (2d) 

a(x k ,u k Jc) | ~ J 0 for k = 0 ... N-l (2e) 

a (* N M { < } 0 (2f) 


where the u ; and x i+1 vectors constitute the control and state vector time histories, the LO and <K) functions define the 
nonlinear stagewise and terminal costs, eq. 2c defines the initial conditions, eq. 2d is the discrete-time dynamics 
difference equation, and constraints such as 2e and 2f may be present to restrict the states, or the control inputs, or 
both. 


The current paper is part of a research program that seeks a fast and reliable way to solve the problem in eq. 2a- 
2f. The ultimate goal is to do real-time aerospace guidance by repeatedly solving this problem. The program is 
taking a two-pronged approach, algorithm improvement and parallelization of computations. This paper relates to the 
first prong, algorithm improvement. Fletcher’s Lj penalty function, trust region adaptation of the method of 
successive quadratic programs (SQP) [1] is one algorithm for solving such a nonlinear program (NP). This algorithm 
has fast local convergence properties, it ensures global convergence (to a local minimum), and it is good at handling 
inequality constraints. The application of this algorithm to the nonlinear trajectory optimization problem results in 
Quadratic Programming (QP) sub-problems with a special structure, the dynamic programming structure. Efficient 
solution of the NP requires efficient solution of the dynamic QP (DQP). Special-purpose algorithms for efficient 
solution of a DLP can be similar to special-purpose algorithms for efficient solution of a DQP. Thus, the present 
paper, in concentrating on DLP, constitutes a sort of warm-up exercise for later development of a DQP algorithm. 

In addition to providing a warm-up, the present work provides a point of comparison with other research efforts 
in the field. Little or no work has been done on special-purpose DQP algorithms [23], but much attention and effort 
has been devoted to special-purpose DLP algorithms (e.g., Refs. 4-7). Fourer provides a useful overview of different 
avenues of approach that have been tried [7]. He groups algorithms for problem la-lc into three categories: Compact 
Basis, Nested Decomposition, and Transformation. All get the correct answer, but none have proved particularly 
successful in that none consistently out-perform the general sparse simplex method with regard to computation time. 

The present algorithm is in the compact basis category; it works with a staircase factorization of the active 
constraints. The factorization used, a staircase QL factorization, is consistent with the plan for subsequent upgrading 
to handle the quadratic cost case. It has numerical stability, and there is no trade-off between numerical stability and 
factor compactness; general sparse matrix LP codes must deal with such trade-offs. The focus of the entire project is 
on aerospace guidance problems, hence the submatrices of the problem, the A a and A a+1 blocks, are relatively dense. 
Therefore, there is hope that the current algorithm will out-perform general sparse matrix LP algorithms on these 
problems (e.g., algorithms such as MINOS [8]). 

The body of this paper concentrates on explanation of the algorithm, with an example and conclusions at the 
end. Before describing the algorithm, problem la-lc is related to a general LP on the one hand and to a control-type 
LP on the other hand. The algorithm description begins with a review of the application of the Lj penalty function 
method to a general LP. The staircase QL factorization then gets presented along with methods for multiplier 
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solution, decision vector solution, and rank-1 update. The algorithm explanation concludes with the presentation of a 
specialized order for problem solution that could further reduce the computational burden. The example at the end of 
the paper demonstrates the algorithm's applicability to aerospace trajectory optimization and examines whether the 
special solution ordering yields increased computational efficiency. 

Equivalent Problem Forms 

The problem in eq. la-lc is a special case of the following general LP form: 


find: 

X 

(3a) 

to minimize: 

J = C T X 

(3b) 

subject to: 

A x - b j^J 0 

(3c) 

where x is defined in 

eq. la and A, b, and c are defined as: 
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Form 3a-3c is fully general. It is the LP form used in developing general active constraint algorithms [1]. 
A more specialized DLP problem statement clarifies the relationship of these problems to controls: 


find: 

u k for k = 0 

... N-l and x k 

for k = 1 ... N 

(5a) 
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where there is a direct correspondence between eq. 2a-2f and eq. 5a-5f, all functions having only linear and constant 
terms in the latter problem. The following definitions put problem 5a-5f in the format of problem la-lc: 


*o = 


*k = 



Akk+i - 




0 O' 
-I 0 


x n = x n , A nn = A x , b N - b x , andCN-c x 

N N N 

Thus, the problem in eq. la-lc is related to controls. 


for k = 1 ... N-l 
for k = 0 ... N-l 


(6a) 

(6b) 

(6c) 
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Algorithm 


The L t Exact Penalty Function and an Active Set LP Method* 

The L! exact penalty function method enforces equality and inequality constraints as in 3c by adding a penalty 
cost to the problem cost that is a weighted L! norm of the constraint violations. The penalty function reformulation 
of problem 3a- 3c becomes 


find: x (7a) 

to minimize: J = c T x + p^JIIAe x - bjlj + UtA^ x - (7b) 


where p^* is a large positive constant, where constraints [A,b] in eq. 3c have been split into the equality constraints, 
[A e ,bJ, and the inequality constraints, [A^b^l, and where [a] + = max(a,0). This technique is actually very similar to 
the adjoining of constraints in Lagrange and Kuhn-Tucker formulations. The only difference is that the constraint 
multipliers are effectively limited in magnitude by p^*. If is greater than the magnitude of the largest multiplier 
in the solution of problem 3a-3c, then problem 7a- 7b has the same solution. The advantage of this technique is that it 
solves the problem in a single phase, optimizing while achieving feasibility. It is a variant of the LP technique 
known as the big M method. The primary reason for using it in the current paper is to make the algorithm 
compatible with the proposed method for eventually solving the NP in eq. 2a-2f. 

Active constraints refer to those constraints that are satisfied as strict equalities. The active set method for 
solving problem 7a,7b consists of the following steps. 


Main Algorithm : 

1. Guess solution, x, and split [A,b] into active and inactive rows: 



= P [A,b] 


( 8 ) 


where P is a permutation matrix, and where the active constraints at the guessed solution must 
yield an A ta that is square and nonsingular. 

2. Compute A*** 1 

3. Compute u** = - (A,j*) T (c + A T inf ^4i nr ^) where the elements of Uu, act are either -p™*, 0, or +Pm ax 


depending on whether the corresponding inactive constraint is an equality or an inequality and 
depending on whether it is positive or negative at the guessed solution, x. 

4. Find the element, i, of that is furthest out of the allowable range for the penalty function, 

(U*ct)‘ ^ fo r equality constraints, (p^Ji € [0, +p mt J for inequality constraints. Stop 

if no elements are out of range; the current x is optimum. 

5. Compute the search direction, Sx = A^*i. where ei is the unit vector with all zeros except for a 1 


in row i. 

6. Compute the new guessed solution, x = x + a Sx, where a has the same sign as (p,^ and where 
its magnitude is chosen to be the smallest value that makes one of the inactive constraints active, 
min loti such that [A^^x-i- a Sx) - b m9Cl ] } = 0, for some row j. 


t The discussion of this section deals with the problem form in eq. 3a-3c. The discussion is based 
on algorithms presented in Ref. 1. 
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7. Interchange rows i and j between [A^.b^] and [Aj aiet ,b inK J, update A ict , and go to step 3. 

The equation in step 3 results from differentiation with respect to x of the augmented cost in eq. 7b. Despite the 
nondifferentiability of eq. 7b for some values of x, differentiation can be done if the active constraints are treated as 
adjoined equality constraints with unknown multipliers u,^ rather than as L, penalty terms. Steps 4 and 5 determine a 
descent direction. The step length is determined in step 6. The step length rule ensures that the entire step is in a 
descent direction of the piecewise-linear augmented cost. Because of the step length rule, the new active set changes 
by only one row. Step 7 takes advantage of this fact in the recomputation of A IC1 s inverse. 

The algorithm used in this paper starts out by assuming that a set of pseudo constraints are active. This yields 
the identity matrix for the initial A. cl , and step 2 is trivial. The allowable range for the pseudo constraint multipliers 

is different than for the actual problem constraints, 6 [0,0]. They get dropped from the active set in the course 

of the algorithm unless the problem solution is not unique. 

For each constraint addition/deletion the algorithm cycles through steps 3-7. Most of the computational load 
per cycle is caused by manipulations with the inverse of the A tcl matrix, multiplication by it in steps 3 and 5 and 
rank-1 updating of it in step 7. The main idea of this paper — indeed the main idea of all compact basis schemes — is 
to compactly represent factors of A, cl that are suitable for carrying out multiplications by A lct ’s inverse and that are 

easy to update when A lct undergoes a rank-1 row change. 

Staircase QL Factorization for Staircase LP 

A, ct inherits a staircase structure from A as in eq. 4. The compact QL factorization used here performs a stage- 
wise backwards sweep to factor the nonzero blocks of A act . The following recursion yields matrices that constitute a 
staircase QL factorization of A tct : 


Dnn = Ann 


■ A k-u-i >cl A k-ik tct 


Dk-u-i 

0 ' 

0 Du 


Dkk-i 

L kk _ 


for k = N, ... ,1 


QooDoo — Loo 


(9a) 

(9b) 

(9c) 


where the A k . lk _ ltc[ and A k _ )ktcl matrices are the nonzero blocks of the staircase A lct matrix, and where the Qu, are 
orthonormal and the L tt are lower triangular. The QL factorization is stored in the matrices Q^, L tt , and for k = 
N.....0 and the matrices for k = N.....1. The matrices have no special properties. Equations 9b and 9c are 
only implicit relations for these factors, but the factors can be explicitly evaluated via Householder transformations. 
At stage k, the factorization begins with the data A k .i k _ 1(lct , A k . lkltt , and D kk , and it computes Qkk- L^, D k . lk _i, and 

Du,.!. The result D k . lk _i then completes the necessary data for stage k-1. 

Numerical stability of the factorization is ensured by the use of orthogonal transformations only. The 
computational complexity of the factorization algorithm is linear in the number of stages and cubic in the 
dimension(s) of the blocks, which is as efficient as can be expected if the blocks are dense. The factor storage is linear 
in the number of stages and quadratic in the dimension(s) of the blocks, which again is the best that can be achieved 
with dense blocks. 

Equations 9a-9b are a DLP equivalent to the matrix Riccati equation of time-varying, multi-stage Linear 
Quadratic Regulator theory. The lower blocks of the right hand side of eq. 9b act as a closed-loop dynamic difference 
equation as will be shown in the next section. The upper block on the right hand side, [Dk-ik-i> propagates active 

constraint effects backwards in time; it summarizes the constraints that x k .i must satisfy in order to make possible the 
satisfaction of all constraints from stage k-1 onwards. Note that the number of rows in [D k . lk .i, 0] does not 
necessarily equal the number of rows in [A k _i k .i, cl> A k . lkact ]. 
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Staircase QL Solution for the Multiplier and Decision Vector Time Histories 


The basic operations involved to do steps 3 and 5 of the LP algorithm presented above involve solution of a 
linear system by orthogonal transformation and forward or backward substitution. This is similar to the forward and 
backward substitutions of the simplex method and its variants. The transformations and substitutions are done in 
stage-wise recursions using the stagewise factors. Performing the following backward recursion then forward recursion 
yields the active constraint multipliers. 


Backward recursion for intermediate multipliers through 2^ 

T T T 

LnN = - C N - ANN inlct UN iotct - AN-lNj MC( UN-lj Mct 

Lkk Ak = - D k+ lk Vm - C k - Au wU'w ■ Ak lk in.cA-l iMC , 

T T T 

Looio = - Dio^i -c 0 - Aoo^ ct Uo i __ 


for k = N-l, .... 1 


(10a) 

(10b) 

(10c) 


Forward recursion for intermediate multipliers JJo through and for active constraint multipliers Uq through u N : 


i-Qoo&o 


Xa 

T 

' & ' 

-fik+1 _ 

= Qk+lk+1 

_ ^W+i _ 


for k = 0 ... N-l 


(Ha) 


(Hb) 


Un. ~ 8 n 


(He) 


Step 5 of the LP algorithm is accomplished by solving the system of equations A tct 8x = e ; . To illustrate how 
the staircase QL factorization does this, the following equations present its use in solving the alternate system A, c ,x 
= b act . Again, a backward recursion followed by a forward recursion yields the solution. 


Backward recursion for intermediate nonhomogeneous constraint terms d N through d 0 and g N through g 0 : 



Bo - Qoo d 0 


for k = N, .... 1 


(12a) 

(12b) 

(12c) 


Forward recursion for the decision vectors x 0 through x^: 


L 0 x 0 = g 0 (13a) 

Lkk x k = - D kk .iX k . 1 + g k for k = 1 ... N (13b) 


As stated earlier, eq. 13b is like the closed-loop dynamic difference equation of multi-stage LQR theory. The matrix 
is lower triangular and allows for easy solution for x^ in terms of x k A and g k . 

Rank-1 Update of Staircase QL Factorization 

This section explains how to efficiently update the staircase QL factorization after a single constraint 
addition/deletion. This procedure must be carried out every time step 7 of the main algorithm is encountered. One 
could recompute the entire factorization, but the practicality of all LP codes hinges on their ability to update the 
factors for much less work than would be required to recompute them from scratch. 
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The general add/drop updating scheme for the staircase QL factorization must update the results of eq. 9a-9b 
when an arbitrary row j at stage kadd gets added to the active constraint set and another arbitrary row i at stage kdrop 
gets deleted from the active constraint set. Thus, [A ttia , A kk+lact ]l k , k ,dd gets a new row and [A^,, Au+^JI^drop 
loses a row. The stages kadd and kdrop can have any relationship to each other, and the update algorithm must be able 
to handle all possible cases. Three different cases can occur, kadd > kdrop, kadd = kdrop, and kadd < kdrop. 

Efficient rank-1 update can be accomplished by a series of stage-wise rank-1 updates linked together in an 
appropriate manner. Three different stagewise rank-1 updating algorithms are needed to do this. The first algorithm 
updates the factors computed in eq. 9b when a new row has been added to the bracketed expression on the left-hand side 
of that equation. The second algorithm updates these same factors in the case of a row deletion from the bracketed 
expression on the left-hand side. This second algorithm also modifies Qnn by a single Householder transformation. 
The third stagewise algorithm updates these same factors when has undergone an arbitrary rank-1 change. Recall 
that the bracketed matrix on the left-hand side of eq. 9b represents the input data for a given stage and the Qu, matrix 
together with the bracketed expression on the right-hand side represents the result of the stagewise factorization. The 
following discussion explains each of these three algorithms and the way in which they work together to accomplish 
the multi-stage rank-1 update. 

First, consider what happens to the stage k factorization when a new row gets added to either [A k .i k . Ucl , A k . lkacl ] 
or Du,. The algorithm begins by adding a row and a column to Qu, with all Os except for a 1 at the intersection of the 
new row and the new column. Thus, Qu, remains orthonormal. Suppose the new constraint row is [p T kl , p T k 2 ], 1,1011 
the new row and column of Qu, are added so that eq. 9b temporarily becomes: 


Qkkil 

0 

Qldci2 

A k . i k _i 

* aci 

Ak-ik, c , 


Dk-lk-1 

0 

0 T 

l 

0 T 

P T kl 

P T k 2 


P T ki 

P T k2 

_ Qkk21 

0 

Qkk 22 _ 

0 

Dkk _ 


_ Dkk-1 

Lkk _ 


where the Q kkij matrix blocks are just the blocks of the original Q kk matrix. Suppose n k is the dimension of the x k 
decision vector. Then it is also the dimension of the square lower-triangular matrix Lu,. A series of n k Givens 
rotations can be performed to zero out p T k2 while preserving the lower-triangular structure of L^. The first Givens 
rotation uses the last row of l kk as the pivot row and zeros out the last element of p\ 2 , and successive rotations use 
successively higher rows of L kk as the pivot and zero out successive elements of p T k2 going from right to left. 
Suppose these rotations are G! to G nk . Then the new stage k factorization becomes: 


Qkknew “ ®”k' 



Qkkll 
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Qkki2 

*Gj* 

0 T 
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Qkk22 _ 
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D k -ik i 
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cT kl 
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= Gn k * ...•G 1 * 

P T ki 

P T k 2 

D kk _i 

L “ ‘new 
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_ Dkk-l 

Lkk . 


D k-lk->ncw 



(15a) 


(15b) 


(15c) 


where the last equation has been included to emphasize the fact that the new D k . lk l differs from the old D k . lk .i by only 
a single new row. This fact sets the stage for the use of this same algorithm at stage k-1. The new Q kk is 
orthonormal because the augmented matrix is orthonormal and because all Givens rotations are orthonormal. Thus, 
the new factors have all of the required properties for use in the LP algorithm described above. 
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Next, consider what happens to the stage k factorization when a row gets deleted from either [A k _ lk _ lact , A k _ lkact ] 
or Djo,.. The following development is based on ideas for QP from Ref. 9. Write in the form 


Qkk - 


Qkkn Qkk 12 Qkki3 

T T 

Q kk21 Qkk 22 Q kk23 

L Qkk31 0kk32 Qkk33 


(16) 


where the middle column conforms in matrix multiplication with the constraint row that is getting deleted - rows of 
Dkk can be referred to as constraints; they are propagated active constraints. The bottom blocks, [Qkk 3 i>Qkk 32 >Qkk 33 L 
have n k rows, the same as in the bottom blocks on the right hand side of eq. 9b, 

The stagewise deletion algorithm starts with a Householder transformation in which the q^^ row in the above 
representation is used as the pivot row to zero out in the first rows. Next, a series of n k Givens rotations is used 
to zero out successive elements of q^j starting with the topmost element and working downwards. Again, the (\& 22 

row in the above representation is used as the pivot. If the Householder transformation is H and the Givens rotations 
are Gi to G nk , then the following changes to the stage k factorization result: 


Qkkltnew 0 Okkl 2 new 
0 T 1 0 T 
Qkk21new ® Okk22new J 


Qkkn Qkk 12 Qkki3 
= G nk # ...*G 1 *H«j Q T kk21 Qkk 22 Q^kk23 

L Qkk3i 0kk32 Qkk33 


Qkkn 


r Qkkn 

L Qkk21, 


Uncw 1 2 n ew 

Qkk 22 new 


D, 


k-lk-l n 


P T kl 


0 

P T k2 


D 


kk-l„ 


f I^k-lk-l 0 1 

G nk -...-G 1 -H| Dkk Lkk J 


(17a) 


(17b) 


(17c) 


where [p T k 1 ,p\ 2 ] corresponds to the constraint that is getting dropped. Orthonormality of the original Q kk matrix 
ensures the form of the result on the left-hand side of eq. 17a. Note that the matrix D k _ lk . lnew is a function only of 
D k .ik-i and H; the Givens rotations do not affect it. Therefore, another Householder transformation, H", can be 
constructed based on the same Householder vector. It yields: 


fc] 


(18) 


where d k _ ldrop is not necessarily equal to p kl . This sets the stage for propagation of the constraint deletion process 
backwards to stage k-1. If gets transformed according to 


Qk-lk 


1 interim 


= Qk 


lk l [o H"] 


(19) 


then Qk- ik- 1 interim is still orthonormal because H” is orthonormal, and because H" is equal to its transpose, constraint 
[0 T ,d T k ldrop J is the constraint that must get dropped at stage k-1. The foregoing algorithm can accomplish the deletion 
at this next preceding stage. 
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The algorithm that performs the multi-stage rank-1 update of the staircase QL factorization starts with the 
highest stage at which either a constraint addition or deletion occurs. It uses whichever of the two foregoing stagewise 
updating algorithms is appropriate to propagate the addition or deletion backwards. It continues until it reaches a stage 
at which both an addition and a deletion must take place. One, but not both, of the changes at this stage may be the 
result of a backwards propagation. At this stage of the concurrent add/drop, the multi-stage algorithm first does a 
single-stage constraint addition followed by a single-stage constraint deletion with no change of stage in between. 

If the index of this stage is k, then D klkl will differ from its pre-update value by a rank-1 change at most. 
This can be shown by recognizing that the result on the left-hand side of eq. 15c becomes the input data on the right- 
hand side of eq. 18 when an add followed by a drop both occur at the same stage: 


^k-lk-l ncw 



( 20 ) 


H" is a Householder transformation; it differs from the identity matrix by a rank-1 matrix, hence the conclusion about 
the change in D klkl . Define this rank-1 change in terms of the vectors r k _! and s^: 

Dk-lk-l ncw = Dk-lk-1 + r k-l sT k-l @1) 


If either r k l or is the 0 vector, then the multi-stage rank-1 update is complete. If not, then another stagewise 
updating algorithm is needed. 

The final stagewise updating algorithm must update the stagewise factors for an arbitrary rank-1 change in the 
data D^. It is allowed to produce at most a rank-1 change in D k _ lk _ x . This restriction on its effect on D k . lk .j makes it 
self recursive for all subsequent stagewise factorizations in the backwards chain. It can be used for updating the 
factorizations of all stages that precede the concurrent constraint addition/deletion stage. It can be used recursively 
until no more updating is needed. 

One might suppose that the necessary algorithm has already been developed in a work such as Ref. 10. That 
paper is a good reference for rank-1 modifications, and it defines the general methodology used in the algorithm below, 
but the relevant algorithm from [10] would result in a rank-2 change to D k _ lk _!. This would destroy the stagewise 
recursive applicability of the algorithm, hence the modified algorithm presented below. 


Suppose there has been a rank-1 modification to as in eq. 21 (except at stage k instead of stage k-1). Then, 
eq. 9b gets modified: 


QkkJ 

where 


H-lk-l 


H-lk 


o [Dkk+ r k sT k] 




( 22 ) 



(23) 


and where v k _! and D k . lk .j have the same number of rows, n^. The algorithm starts by reducing the v-w vector to a 

vector with zeros in all of its entries except the last two. This is done by first applying a Householder transformation, 
H,, to the first n^+l rows to zero out the first n dk _j rows. Then a series of Givens rotations, G x to G nk _i, is applied 

to successive pairs of rows of the resulting vector to zero out successive elements until only the last two elements are 
left nonzero. These same transformations are applied to all terms on both sides of eq. 22, and the two terms on the 
right hand side of the equation are added together with the following (partial) result: 
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(24) 


G 


nk-1 


HI-1H 



av k _! 0 0 ... 0 

* * 

* * * 0 

* * * * 


* * * * * 

* * * * * J 


where a is a scalar and where asterisks (*) indicate nonzero scalar elements of the matrix. The top row of vector 
entries in the right hand matrix corresponds to the upper right-hand 0 block in the bracketed expression on the right 
side of eq. 9b. The bottom rows constitute a matrix with nonzero elements on the first diagonal above the main, on 
the main diagonal, and below the main diagonal. All elements on diagonals that are 2 or more above the main 
diagonal are zero. The nonzero entry in the first column of the top block, av^, results from application of the Hi 
Householder transformation to matrix [0 T ,L T kk ] T . 


The remaining transformations are applied in order to restore lower triangularity to the matrix on the right hand 
side of eq. 24. First, a series of n k -l Givens rotations, G nk to is applied to successive pairs of rows of the 

matrix starting from the last two rows and working up to the first two rows in the lower block. Each rotation zeros 
out one of the above-diagonal elements. At the end of this operation the matrix has the form 


— ctv k _i 0 0 ... 0 “ 

* 

* * 0 

* * * 

♦ * * * 

— * ****-. 


(25) 


so that the lower block is lower triangular. The final part of the algorithm is to apply a last Householder 
transformation, H 2 , to zero out the first column of the top block. These operations result in the following factor 

updates: 


Qkknew = H 2 »G 2nk - 2 *...*G 1 *H,*Q kk 


(26a) 


Duklnew 0 

®kk-l new ^-'kk new _ 


= 112-G^ 


-2»...*G 1 *H 1 *| 


Dk-lk-1 

Dkk-i 



1} 


(26b) 


In both of the Householder transformations, the first n dkl elements of the transformation vector are parallel to v k .], and 
none of the Givens rotations affect the first n^.j rows of the bracketed expression on the right of eq. 26b. Therefore, 
Dk i k i new differs from D k lk l only by a rank- 1 matrix: 


Dk-lk-l^ = Dk-lk-1 + v k-iy T k-l (22) 

where the vector y k .j can be determined from the algorithm presented above. Thus, the algorithm updates stage k 
according to the rank-1 change in the stage's input data, and it produces a similar rank-1 change in the input data for 
stage k-1. The multi-stage rank-1 updating algorithm propagates these rank-1 changes backwards until at some stage 
one or both of the vectors in the rank-1 change are zero. This occurs at least by the time stage k = 0 is reached 
because D.j.j has zero dimension. 
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Numerical stability of the factorization update is ensured by the use of orthogonal transformations only. The 
computational complexity of the multi-stage update algorithm is linear in the number of stages affected and quadratic 
in the dimension(s) of the blocks, which is as efficient as can be expected if the blocks are dense. If the number of 
stages affected by a particular row interchange of active and inactive constraints can kept small, then the cost of the 
update will be small. This fact provides the motivation for the solution scheme presented in a later section. 

Possible Improvements to Banded Staircase QL Factorization 

Several issues come to mind in considering the forgoing use of a QL factorization for an LP basis factorization. 
They all revolve around a single question: is the entire factorization needed to implement the LP algorithm? For 
instance, a general LP method has been developed that uses LQ factorization but does not store Q [11]. Not storing 
the Q factors would yield a great savings in memory and computation time if it carried over to the present multi-stage 
algorithm. This presents no difficulty to the procedures for solving for the multiplier and decision vectors, steps 3 and 
5 of the main LP algorithm. The problem with not storing Q occurs in the factor update, step 7. There is no 
apparent way to do the single-stage constraint deletion or the single-stage rank-1 modification without storing at least 
some of the matrix. Reference 9 has some ideas in its section on quadratic programming that could be used to 
eliminate storage of the lower part of Q^. Alternatively, storage of and D^-i could be eliminated. Savings in 
computation time and memory would be about the same for either scheme, about 30% savings. These issues may be 
explored in a later work. 

Backwards-Sweeping Pseudo Constraint Relaxation and an Alternate Method of Selecting the 
Active Constraint to Drop 

In theory, all dynamic programming problems can be solved by first computing the cost-to-go at each stage, 
then solving a single stage optimization at each stage. Part of the cost for each of these single stage problems is the 
cost-to-go that results from the stage’s decisions. For DLPs and for their associated Li penalty function problems, the 
cost-to-go at a given stage is a piecewise-linear convex function of the decisions at that stage. This convexity 
property gives rise to a hope that DLPs may have a property like the stability property of their quadratic-cost 
counterparts, multi-stage LQR problems. In the DLP context, this property might mean that a small change in the 
decisions at a given stage would give rise to even smaller changes in the state at subsequent stages. This might 
translate into a grouping of constraint additions and deletions at stages nearly following the stage at which the decision 
variations are taking place. 

If this property exists, it can be exploited without the necessity of computing the entire cost-to-go function. If 
all of the active constraint multipliers for constraints following a given stage are within their allowable range, then the 
guessed solution is an optimal trajectory for all stages following that stage. Also, the local linear piece of the cost-to- 
go function is known. Suppose the given stage can be optimized without causing any of the multipliers at subsequent 
stages to exceed their penalty function bounds. Suppose also that all of the original pseudo constraints are active 
for the preceding stages. Then, the rank-1 updates that would have to be done during the optimization of that stage 
might involve changes to very few stages. The assumption about the the pseudo constraints ensures that the updates 
will not affect any of the stages preceding stage k-1 if stage k is being optimized. The possibility of stability implies 
that max(kadd,kdrop) might, in most cases, not be much larger than k. 

A change is needed to the main LP procedure presented above. It allows the multipliers at stages subsequent to 
the stage being optimized to vary outside of their penalty function bounds. The modification needs to be in the 
selection of the active constraint that gets dropped on each cycle. In the main algorithm, the dropped constraint is the 
same as the non-optimal constraint that gets relaxed in steps 4-6. This could cause an active constraint multiplier that 
was within its bounds to go out of its bounds. If the multiplier corresponded to a constraint at a subsequent stage, 
then the optimality of the subsequent stages would break down. 

This situation can be avoided by performing a search in the active constraint multiplier space for the active 
constraint to be dropped. This search is the dual of that carried out in steps 5 and 6 of the main algorithm, and die 
search direction is defined by relaxing the Li penalty function constraint on the multiplier associated with inactive 
constraint j, the inactive constraint that is becoming active. The size of the step in multiplier space is chosen to be 
the smallest that brings one of the active constraint multipliers to a bound which the multiplier would violate if the 
step size were larger; the new active constraint must be included in this test. The active constraint whose multiplier 
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bound limits this step size is the active constraint that gets dropped. It is not necessarily the constraint whose 
relaxation was dictated in step 3 of the main algorithm. 

With this scheme in place, only pseudo constraints will have multipliers that are out of bounds in the step 3 
optimality test. The number of non-optimal active constraints will never increase. In turn, each pseudo constraint 
will eventually be the constraint that gets chosen for dropping, though this may happen while another pseudo 
constraint is being relaxed. 

A special order has been chosen for relaxing pseudo constraints to take advantage of the possibility of savings 
from "stability". The modified main algorithm starts by testing and relaxing only stage-N pseudo constraints in steps 
3-6. This continues until all of the stage-N pseudo constraints have dropped from the active list or have zero 
multipliers. Then the algorithm switches to exclusive consideration of the stage-(N-l) pseudo constraints in steps 3-6. 
It performs add/diop cycles until all of these pseudo constraints get dropped or have zero multipliers. It continues this 
stagewise pseudo constraint relaxation scheme in a backwards sweep all the way to stage 0. The guessed solution is 
optimal after the last stage-0 pseudo constraint has been dropped or has had its multiplier go to zero. The trajectory 
from stage k to stage N is an optimal trajectory once all of the stage-k pseudo constraints have been dropped or have 
had their multipliers go to zero (although it probably will not be the final optimal trajectory associated with the 
solution to the overall problem). 

Comparison of Algorithm Complexity with Matrix Riccati Equation 

Table 1 compares the present algorithm’s computational complexity with that of related algorithms for a typical 
aerospace controls problem. The time- varying multi-stage Matrix Riccati equation actually does not compute an A act 

because it solves a different optimization problem. It has been included because control engineers are more familiar 
with it. The three QL factorization entries assume that the factors are built up from initial pseudo constraints via 900 
rank-1 updates. In the last two entries, assumptions are made about the average number of stages affected per rank-1 
update. The table clearly indicates that the staircase QL factorization makes a tremendous improvement in comparison 
to the dense factorization; the improvement will not be nearly so great in comparison to a general sparse matrix code. 
Also, large improvements are expected from the special ordering of the pseudo constraint relaxation. Note that all of 
the algorithms are far more costly than the implementation of a time-varying LQR solution. Inequality constraints are 
difficult to handle. 


Table 1. 

A Comparison of Effort for Factorization of A act for a Typical Aerospace Control Example 
(100 stages, 6 state vector elements, 3 control vector elements) 


Solution Method Effort 

(No. of Mult, Div., & Sqrt) 


Matrix Riccati Equation 1 12,000 

Dense Matrix QL, Not storing Q 2,920,000,000 

Staircase QL, Arbitrary order of pseudo constraint relaxation 90,700,000 

Staircase QL, Special order of pseudo constraint relaxation 4,400,000 


Aerospace Example 

A simple aerospace control problem has been solved with the algorithm in order to demonstrate the usefulness 
of this class of algorithms on aerospace problems and in order to study the algorithm's behavior. The problem is one 
of fixed-time maximization of the distance travelled by a thrust- and impulse-limited ground-to-ground rocket. The 
continuous-time problem is: 


find: 

u(t) for 0 < t < tf = 12 sec 

(28a) 

to minimize: 

J = -[1000 ]x(tf) 

(28b) 

subject to: 

x(0) = 0 

(28c) 
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X + 


u + 


(28d) 


“0 100 
dx 0 0 0 0 

dT = 000 1 

-0000 


0 0 
32.2 0 
0 0 
0 32.2 


llu(t)ll 2 < 5 g 


J 


Hu(x)ll 2 dx< lOg-sec 



(28e) 

( 28 f) 


-[001 0]x(t) < Oft. (28h) 

which is a point-mass model of motion in the vertical plane. The acceleration (thrust) limit is 5 gs and the impulse 
limit is 10 g-sec. The first two state vector elements are horizontal position and velocity; the last two elements are 
vertical position and velocity. Only thrust and a uniform gravity field act on the rocket. The first control vector 
element is horizontal acceleration; the second element is vertical acceleration. Constraint 28h keeps the rocket above 
the ground. 

In order solve this problem with this paper’s algorithm, the control time history has been approximated by a 
24-stage zero-order hold. Additionally, the norms in constraints 28e and 28f have been approximated by functions 
with octagonally-shaped contours. The 2-norm's contours are spherical; so, this approximation introduces some 
modelling error. Fixing the end time seems unnatural, but it is necessary in order to be able to model the problem as 
an LP. An NP model is needed to handle the free-end-time case. 

The LP code solved this problem in 55 min. on an IBM PC-AT with an 80287 coprocessor. It started from a 
first guess that violated inequality constraint 28h at every stage and that foolishly tried to maintain a constant thrust 
for the entire trajectory. Figures 1-3 compare the multi-stage LP solution with the exact continuous-time solution. 
In Fig. 1, the LP solution does better than the exact solution because of mis-modeling; it takes advantage of some 
extra thrust available at some points of the octagon norm. The thrust magnitude and angle time histories. Fig. 2 & 3, 
are both close to the exact solution, and the discrepancies are due to the same modeling error. 

Figure 4 gives a 2-dimensional histogram of the constraint addition/deletion frequency. The left-hand horizontal 
axis indicates the stage at which the pseudo constraints are being relaxed in the special backwards-chaining process. 
The right-hand horizontal axis indicates the stage at which constraint additions and deletions are occurring during that 
relaxation process. The vertical axis gives the frequency of additions/deletions at the given right-hand-axis stage during 
pseudo-constraint relaxation at the given left-hand-axis stage. The extreme left-hand side of the figure shows no 
constraint addition or deletions - none can occur at any stage before stage k-1 when the pseudo constraints at stage k 
are the ones being relaxed. The peaks on and near the center diagonal of the graph lend support to the conjecture that 
most of the constraint additions/deletions will happen at stages near the pseudo-constraint-relaxation stage. Note, 
however, that a moderate amount of constraint addition/deletion activity occurred near the terminal stage throughout 
the optimization. Nevertheless, the average factor update was relatively cheap. Altogether, about 800 rank-1 updates 
occur during the optimization. The total number of decision vectors in the time history is 312 - 9 extra states are 
needed to model the impulse constraint in eq. 28f. 

Conclusions 

An algorithm has been presented for solving Dynamic Linear Programs. It takes advantage of the staircase 
structure of the active constraint matrix by factorizing it into staircase QL factors. These are derived in a stagewise 
fashion and play a role similar to that played by the time-varying matrix Riccati equation in multi-stage LQR theory. 
All of the usual linear programming functions have been implemented with the staircase QL factorization: decision 
vector solution, multiplier solution, and rank-1 updating. Each function has a computational complexity of 0(n 2 N) or 
less, where n is a block dimension and N is the number of stages. This is the best that can be expected for dense 
blocks. Numerical stability is assured via the exclusive use of QL factors and is independent of pivoting strategies. 

The algorithm is a modified active set implementation of the big M method with pseudo-constraint 
initialization. The modification restricts the set of non -optimal constraints that can be relaxed at one time to a single 
stage. This restriction gets iterated through all the stages in a backwards chain. Also, the modification chooses the 
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constraint that gets dropped in a way that assures optimality of the final portion of the solution time history. The 
modified strategy’s goal is to reduce the average complexity of the rank-1 updates. 

A 24-stage example problem has been solved. The algorithm solves the 312-dimensional problem in about 800 
add/drop cycles, requiring 55 min. on an IBM PC-AT. The average update complexity is significantly reduced by the 
modified active set strategy. 
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Abstract 

A temporal finite element based on a mixed form of the Hamiltonian weak principle is presented for optimal 
control problems. The mixed form of this principle contains both states and costates as primary variables that 
are expanded in terms of elemental values and simple shape functions. Unlike other variational approaches to 
optimal control problems, however, time derivatives of the states and costates do not appear in the governing 
variational equation. Instead, the only quantities whose time derivatives appear therein are virtual states and 
virtual costates. Also noteworthy among characteristics of the finite element formulation is the fact that in the 
algebraic equations which contain costates, they appear linearly. Thus, the remaining equations can be solved 
iteratively without initial guesses for the costates; this reduces the size of the problem by about a factor of two. 
Numerical results are presented herein for an elementary trajectory optimization problem which show very good 
agreement with the exact solution along with excellent computational efficiency and self-starting capability. The 
goad of this work is to evaluate the feasibility of this approach for read-time guidance applications. To this end, a 
simplified two-stage, four-state model for an advanced launch vehicle application is presented which is suitable 
for finite element solution. 


Introduction 

Future space transportation and deployment needs are critically dependent on the development of reliable 
and economical launch vehicles that will provide flexible, routine access to orbit. A particular requirement now 
receiving attention is that for an advanced technology heavy-lift vehicle. Future space transportation Systems 
will need to place large payloads - 100,000 to 150,000 pounds - into low Earth orbit at an order of magnitude 
lower cost per pound. Such systems will also require on-board algorithms that maximize system performance 
as measured by autonomy, mission flexibility, in-flight adaptability, reliability, accuracy and payload capability. 
They must be computationally efficient, robust, self-starting, and capable of functioning independently of ground 
control. Also, the algorithms must be designed with the anticipation that the launch vehicle will undergo 
evolutionary growth [1]. 

One approach to optimal guidance consists of repeatedly solving a two-point boundary- value problem that 
results from the traditional necessary conditions for optimality in an optimal control problem formulation. The 
vehicle state at discrete instants of time along a trajectory can be viewed as a new starting condition, and the 
remainder of the trajectory is reoptimized for that condition. The open loop optimal control is applied for a 
short interval of time, and feedback is introduced by reoptimization at the next time instant. This process 
presupposes that the two-point boundary-value problem can be reliably solved in a time interval that is small 
compared to the control update interval. Knowledge of the previous solution helps in providing a good starting 
point for the optimization process; however, no method has been demonstrated that can operate reliably in a 
real time environment. 

This paper examines a finite element approach to addressing this problem. Hamilton’s principle has tra- 
ditionally been used in analytical mechanics as a method of obtaining the governing equations of motion for 
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continuous systems. In the 1970’s a form of Hamilton’s principle called Hamilton’s law of varying action was 
first used by Bailey (see, for example, [2]) to obtain direct solutions to dynamics problems in the time domain, 
thus introducing Hamilton’s principle into computational mechanics. During the last decade, it was shown by 
Borri ct ai [3] and by Peters and Izadpanah [4] that these direct methods, when expressed in a weak form, 
could be competitive with numerical solution of the corresponding ordinary differential equations. Later work 
by Borri ct al [5] has shown that a mixed form of the weak principle has further computational advantages, 
namely that shape functions can be chosen from a far less restrictive class of functions. 

In [6], Hodges and Bless have shown that optimal control problems can be solved in a virtually identical 
way to that of the mixed form of Hamilton’s weak principle. Hence, the method as used in [6] and the present 
paper has been called the weak Hamiltonian method for optimal control problems. Finite element methods 
have some advantages over other solution procedures; one advantage is that finite element methods provide 
the possibility for development of algorithms which converge reliably. The present method, at least for the 
problems investigated to date, is essentially self-starting. This meets a key operational requirement for on- 
board algorithms. However, application of the finite element method to optimal control problems is rather 
new. For example, Patten [7] used a Ritz-Galerkin technique with Lagragean interpolation polynomials. One 
advantage of the present formulation is the allowance for a simpler choice of shape functions. The computational 
savings which may stem from this are now under investigation. 

In this paper, a weak form governing optimal control problems is derived, and a finite element procedure is 
outlined for the solution of such problems. Numerical results for the solution of a simple trajectory optimization 
problem are presented and compared with the exact solution to demonstrate the accuracy and efficiency of 
the weak Hamiltonian finite element formulation. In anticipation of applying the present method to optimal 
guidance of a rocket booster, a simplified twostage model suitable for this problem is presented. In [6] a 
one-stage model was analyzed and finite element results were compared to a numerical solution obtained using 
a multiple shooting method [8]. Of particular interest are the self-starting operation and the performance in 
terms of execution time and accuracy versus the number of elements used to represent the time span of the 
trajectory. 


Weak Principle for Optimal Control 

A definite analogy exists between the mixed formulation of Hamilton’s weak principle in dynamics and the 
first variation of the performance index in optimal control theory. Specifically, there is an analogy between the 
generalized coordinates and generalized momenta in dynamics and the states and co-states in optimal control 
theory. Only a brief development of the weak Hamiltonian method for optimal control problems is presented 
herein. More details on the development and the analogy with dynamics problems may be found in [6]. 


General Development 

We start with a performance index taken from Eq. (2.8.4) of Bryson and Ho [9]. Its first variation will be 
taken in a standard manner, except that states, costates, and controls will have arbitrary variations. Rather 
than setting its first variation equal to zero, however, it will be set equal to an expression which contains the 
terms that are necessary to transform all boundary conditions to the natural or “weak” type. The final weak 
form is then obtained by integration of this equation by parts in such a way that no derivatives of states or 
costates appear. 

It should be noted that the fundamental relationships are not being changed. To make certain of this, 
we will ensure that the resulting formulation produces the Euler- Lagrange equations and boundary conditions 
which have already been established in optimal control theory (see, for example, [9], Eqs. 2.8.15 - 2.8.21). 

In order to clearly understand what is meant by a “weak” formulation and the derivation of the weak 
formulation that is to follow, we first study a more simple problem. Let us start with a functional of the form 

J = J F(y,y',x)dx (1) 

where A and B are fixed numbers. The necessary conditions for an extremal are defined by 
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( 2 ) 
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Introducing dF/dy' = / for notational convenience and integrating by parts we obtain 




Sydx + f6y 



(3) 


The integrand in the above equation is the familiar Euler- Lagrange equation. The trailing term leads to the 
boundary conditions. If y is specified at x — A or x = B, then by = 0 at x = A or x = B respectively. This 
is referred to as a strong boundary condition. If y is not specified at one of the endpoints, then / = 0 at that 
endpoint. This is referred to as a natural or weak boundary condition. The key points to remember are that 
the trailing term itself is zero at each endpoint and that specifying y requires that by = 0 at a point. 

In our weak formulation, we want all the boundary conditions to be of the weak type. Thus, even if 
y(A) = 0 then £j/(^4) / 0. To allow for this mathematically, we introduce a new variable / which represents the 
discrete value of / at an endpoint. The variation of J is now set equal to f6y\% yielding 


SJ 


=J A {^ S,+M ) d * =iSy 


This is referred to as a weak form. If we integrate by parts, we obtain 




6ydx = (f-f)Sy 


(4) 


(5) 


Note that the Euler-Lagrange equation is the same as before and that the two boundary conditions are that 
/ = / at the two endpoints; thus, the trailing terms are still constrained to be zero, but in a weak sense and by 
need not ever vanish. 

The advantages of the weak formulation are not apparent from the above discussion. However, when we 
apply this type of formulation to problems in dynamics (see, for example [6]), or optimal control theory [6], and 
use finite elements, then we have a powerful problem-solving tool. 

Consider a system defined by a set of n states x and a set of m controls u. Furthermore, let the system 
be governed by a set of state equations of the form x = /(x, u } t). We may denote elements of the performance 
index, J, with an integrand L(x,u,i) and a discrete function of the final states and time ),</]. In addition, 

any terminal constraints placed on the states may be placed in the set of q functions ] and adjoined to 

the performance index by a set of q discrete Lagrange multipliers v. Finally, we will adjoin the state equations 
to the performance index with a set of Lagrange multiplier functions A (t) which are referred to as costates. This 
yields a performance index of the form 


J = 



L - \ T f)dt — <p 


l«/ 



( 6 ) 


Taking the first variation of J and setting it equal to an expression chosen so that all boundary conditions are 
of the weak type, one obtains 


344 



( 7 ) 


where 


/ </ 

[6A T (i - /) + Si T A -6L- 6f T A] dt 
— 6xJ\j — 6v T \l>\ t ? 

= 6tj (A T x)| t ^ - SxqXq - 6X T (x — x) | 1 



( 8 ) 


The right hand side of Eq. (7) contains terms necessary to form all of the proper boundary conditions as natural 
ones. The quantities x and X are discrete values of the states and co-states at the initial (with subscript 0) 
and final times (with subscript /). Depending on the problem, these values will either be specified or left as 
unknowns. 

From Eq. (7), we can directly write down a weak formulation. Before this is done, however, let us examine 
this expression to ensure that it produces the correct Euler- Lagrange equations and boundary conditions. 
Integrating the 6x T term in Eq. (7) by parts and expanding the variation of L , one obtains 


/:{ 


6A r (x - /) - Su 1 


- # *#) T+ t 

-6tj ( 


[©'*©’ 

“]}■ 


d -l) 

dxj 


A + A }di 


— Su 1 tp 


L + X 1 f 4 




+ 6xJ [Xj — Xj ^ — Sx o ^Ao — Ao^ 

+ 6Xj ( x j - xj) - 6Xl (x 0 - *o) = 0 


( 9 ) 


where Xq, A 0 , x/, and A j represent the values of those functions at the initial and final times, respectively. The 
coefficients of 6X T , 6x T , and 6u T in the integrand are the three correct Euler- Lagrange equations, Eqs. 2.8.15 
- 2.8.17 from [9]. There are also six trailing terms in Eq. (9) from which the boundary conditions can be 
determined. The equations corresponding to the first four and the sixth of these terms correspond to the correct 
boundary conditions in [9]. Namely, the requirement for the coefficient of 6v T to vanish yields Eq. (2.8.21). The 
requirement for the coefficient of 6tj to vanish is equivalent to Eq. (2.8.20). The requirement for the coefficient 
of Sx'j to vanish shows that the final value of A equals A j as given in Eq.(8), which corresponds to Eq. (2.8.19). 

If Ao is chosen as zero, the requirement for the coefficient of 6x$ to vanish requires the initial value of A to equal 
zero; on the other hand, the requirement for the coefficient of 6Xq to vanish requires the initial value of x to 
equal x 0 , in accordance with Eq. (2.8.18). Finally, the requirement for the coefficient of <5Aj to vanish demands 
that the final value of x equal the discrete value xy; this has no counterpart in [9] since the elements of xj are 
usually unknown. 

Having satisfied our requirement that none of the fundamental equations are altered, we may now derive 
our weak formulation from Eq. (7). In order to allow for the simplest possible shape functions when we introduce 
a finite element discretization, we do not want time derivatives of x and A to appear in the weak formulation. 
Therefore, we integrate the x term by parts in Eq. (7) yielding 
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— Sx^Xj + 6x^Xq 4 * SXjXj — 6X^xq = 0 


- 6X x — 6 A / — 6u 1 


dt 


- Stj (. 


L+AT/+ §r +,/T ^ 


(10) 


This is the governing equation for the weak Hamiltonian method for optimal control problems. It will serve as the 
basis for the finite element discretization described below. It should be noted that normally one will encounter 
various types of inequality constraints in problems that deal with optimal control. Inequality constraints will 
be the subject of future research. 


Finite Element Solution 

Note in Eq. (10) that time derivatives of 6x and <5A are present. However, no time derivatives of x y A, u 
or 6u exist. Therefore, it is possible to implement linear shape functions for 6x and 6 A within elements and 
constant shape functions for x, A, u, and 6u within elements. 

For simplicity, let us break up the time interval into N segments of equal length At — . Let the values 

of time be given by f * for i = 1, 2, . . . , N + 1 at the points where the time interval is broken, the so-called nodes. 
Here to = fi and tj = Jat+i. Then, introduce a nondimensional elemental time r such that 


t -tj _t-ti 

^i+i — At 


(0 < T < 1 ) 


( 11 ) 


Now, in accordance with the above guidelines, and letting i = 1 , 2 ,..., AT, we can choose simple linear shape 
functions 


6x = <5x(,)(l - r) -f &r( i+1 )r 
6A = £A ( i)(l — r) + ^A( 1+1) r 


where the arbitrary, discrete virtual states and virtual costates are defined at every node point. (The nodal 
indices are enclosed in parentheses to avoid confusion with the state column matrix index.) For the states and 
costates, we can choose even simpler shape functions 


and 


x 


x ( i) if r = 0; 

X(,) if 0 < r < 1; 
x (l>1) if r = 1 


(13) 



if r = 0; 
if 0 < r < 1 ; 
if r = 1 


(14) 


where in both cases, i = 1,2, . . . , N. For u and Su, since their time derivatives do not appear in the formulation, 
we let 
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Su = 6ii(i) 


(15) 


where i = 1,2,...,//. 

Substitution of Eqs. (12) - (15) into Eq. (10) along with Eq. (11) and 


t = U + rAt 
dt — A tdr 


( 16 ) 


yields a set of nonlinear algebraic equations which can be assembled in accordance with standard finite element 
practice. For the sake of brevity, these rather lengthy equations for the general case are not written out explicitly 
here. They are, however, written out in [6] wherein it is noted that there are 2 n(N + 1) equations for the states 
and costates, mN equations for the controls, q equations for the end-point constraints, and 1 equation for the 
unknown time equation. In the assembly process, and drop out of the equations for * = 2, . . . , N (i.e., 
for all but the ends of the time interval to = t\ and tj = f/yr+i). Thus, the total number of unknowns is 2n(A r + 2) 
for the states and costates, mN for the controls, q for the v\ s, and 1 for the unknown time tjv+i . Thus, there are 
2n more unknowns than there are equations, which allows for one to choose, say, x 0 in accordance with physical 
constraints and A/ in accordance with Eq. (8) and solve for the rest. The resulting equations may be explicitly 
perturbed to obtain the Jacobian and solved iteratively by a Newton- Raphson method or by any method that 
is suitable for nonlinear algebraic equations with very sparse Jacobians. 

It is also pointed out in [6] that n(N + 1) of these algebraic equations contain the n(N + 1) unknown 
costates and that these equations are linear in the costates. Thus, the costates can be solved for symbolically 
in terms of the other unknowns, and the remaining equations can be solved circumventing the need for initial 
estimates of the costates. This decreases the size of the problem by approximately a factor of two. 

Example: A Free-Final -Time Problem 

In this section a relatively simple optimal control problem is solved by means of the Hamiltonian weak 
formulation, and the results are compared with the exact solution. Of particular interest is the computational 
effort for varying numbers of elements, the ability of the method to converge for various values of the system 
parameters without needing new initial estimates of the unknowns, and, most important, the accuracy of the 
method versus the number of elements. 



Problem Statement 

The problem is taken from [9], article 2.7, problem 9. A particle of mass m is acted upon by a thrust force 
of magnitude ma. Assuming planar motion and making use of an inertial coordinate system x\ and x 2 as shown 
in Fig. 1, we may write the dynamical equations as 
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(17) 
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1 
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0 1 
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X+ < 

0 

0 

0 

0 

0 

| a cost* 

.0 

0 

0 

0. 


l a sin it « 


where x\ is the horizontal component of position, X 2 is the vertical component of position, x 3 is the horizontal 
component of velocity, and x 4 is the vertical component of velocity. The control is the thrust angle u, and 
x(0) = xo = [0 0 0 0J T . We want to obtain given values h of the final vertical component of position and U 

of the horizontal component of velocity. The final value of the vertical component of the velocity, must vanish, 
but we do not care what the final value of the horizontal component of position is. For the optimal control 
problem, the initial time is taken to be zero, and the final time T is to be minimized so that <f> = 0 and L = 1 
which yields J =T. The elements of the end-point constraint function xp must vanish where 


* 


-{=£*) 


(18) 


In accordance with Eq. (8) 


A/ = 



(19) 


Substitution of these equations into the weak form, Eq. (10), yields a system of algebraic equations whose 
size depends on N . 


Results and Discussion 

These equations are solved by a Newton-Raphson algorithm with trivial initial guesses (that are never 
changed regardless of input parameters) for TV = 2. These results are then used to obtain the initial guesses for 
arbitrary N by simple interpolation. In all results obtained to date for this problem, no additional steps are 
necessary to obtain results as accurate as desired. 

Representative numerical results for x\/h versus x^/h are presented in Fig. 2 for a case with jj? = 0.75. 
Also, the control angle u versus dimensionless time t/T is presented in Fig. 3. Based on other results, not shown 
due to space limitations, accuracy for the costates is comparable to that for the states and for u. It can easily 
be seen that N = 8 gives acceptable results for all variables. 

It should be noted that the computer time on a Cyber 990 is only about 2 seconds for N = 2, AT = 4, and 
N = 8 and 3 seconds for N = 16. Thus, it is relatively insensitive to N. 

Overall, the method provides very accurate results for this problem with only a few elements and for 
minimal computational effort. Furthermore, in results that are not presented herein, the Hamiltonian is seen 
to converge nicely to zero all along the trajectory as the number of elements increases. 
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Fig. 2: Dimensionless vertical position 
/h versus horizontal position x\/h 



An Advanced Launch System Model 


In this section, a model is presented which is suitable for evaluating the potential usefulness of the weak 
Hamiltonian finite element approach in real time guidance of an advanced launch system. A two-stage vehicle 
is considered that is simplified by not allowing for any inequality constraints. 


We confine our attention to vertical plane dynamics of a vehicle flying over a spherical, non-rotating earth 
as depicted in Fig. 4. This results in the following state model for the states m (mass), h (height), E (energy 
per unit mass), and y (flight-path angle): 
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The aerodynamic, propulsion, and atmospheric models are given by the following equations: 

T =T vac - A eP 

p = Po (l - 0.00002255A) 5 256 for h < 11000m 

h — 11000^ for h > 11000l n 


P =Pi l exp 


( J 


6350 ) 

=R e + h- V = ^2(e+£); q= ^ 


p -p o exp 


LJL V 

V 6700 J ’ 


a 


( 21 ) 


D =qS [C do (M) + a 2 C Na (M)] 

C Lo (M) =C Na (M) - C do {M) 

a =a 0 \/l - 0.00002255/j for h < 11000m 
a =295.03ms -2 for h > 11000m 


The vehicle parameters chosen for this model are based on a Saturn IB launch vehicle SA-217 [10] and are 


I spi = 263.4s; 
T vaCl = 8155800N; 
A ei = 8.47m 2 ; 


I, P3 = 430.48 
T vaea = 1 186200 N 
A t , = 5.29m 2 ; S = 33.468m 2 


( 22 ) 


where subscripts “1” and “2” refer to the first and second stages respectively. The aerodynamic coefficient data 
Cdo and Csa are presented as functions of the Mach number M in Tables 1 and 2. The physical constants 
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used in the above model are the earth’s gravitational constant p — 3.9906 x 10 14 m 3 s 2 , the earths mean 
radius R t = 6.378 x 10 6 m, the sea-level atmospheric pressure p 0 = 101320Nm -2 , the atmospheric pressure at 
11km pn = 22637Nm -2 , the sea-level density of air p 0 = 1.225kg m -3 , and the sea-level speed of sound in air 
a 0 = 340.3ms - 1 . 


M 

Cd o 

0.20 

1.00 

0.75 

0.45 

0.98* 

0.80 

1.00 

0.80 

1.02* 

0.80 

3.50 

0.20 

6.00 

0.02 


Table 1: Aerodynamic coefficient Coo versus Mach number 
(* denotes a common end point of two quadratic polynomial curves) 


M 

Cnc 

0.00 

6.20 

0.50 

6.35 

0.98* 

7.70 

1.00 

7.70 

1.02* 

7.70 

2.50 

5.20 

4.40* 

4.70 

5.00 

5.50 

6.00 

6.00 


Table 2: Aerodynamic coefficient Cjsf a versus Mach number 
(* denotes a common end point of two quadratic polynomial curves) 


The performance index is 


j = 4> \ tj = H, 


(23) 


and the final time tj is open. The initial conditions specified are m(0) = 5.2 x 10 5 kg, h( 0) = 1800m, E( 0) = 
—6.25 x 10 7 m 2 s -2 , and 7(0) = 75°. The final energy is specified as E{tj) = —1.25 x 10 7 m 2 s -2 . The burnout 
mass of the first stage is 192000 kg and the drop-mass of the booster is 51000 kg. 

For this two-stage model, we must modify our formulation somewhat. We must accomodate for the unknown 
staging time the constraint on the mass at t s (as opposed to a constraint on the states at the final time), 
the jump in the mass at t S} the jump in the mass costate at t ti the condition for a continuous Hamiltonian at 
t s (continuous since <f> and x p are not explicit functions of time), and finally the change in state equations at 
i $ (due to the change in thrust). We further point out that the control u is discontinuous at the staging time. 
However, since only discrete mid-point values of u are solved for, the jumps are allowed to occur automatically 
at the nodes. 

The new performance index (with L — 0) is 


-j: 


[X T (i - fi)dt+ f [A T (i - / 2 ) dt - i/i V»i 

Jt. 




-j/ 2 V > 2 


t. 


(24) 
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where f\ and fi represent the state equations before and after t 8 respectively, \j)\ — m(t g ) — 192000 and 
V> 2 = E(tj) + 12500000. 


The weak formulation is derived exactly as before and the same shape functions can be employed. This 
time, however, we will discretize the time from 0 to t s with N\ elements and the time from t s to tj with N 2 
elements. The algebraic equations shown in [6] for the weak formulation are readily modified to account for the 
various jumps and state equation discontinuities. The resulting equations are 


tk 


i= 1 


+ 6 x T + 1 


-6u 


— A 

1 

i> 



2 

[dxj, 


A<i 

( d A V 


“ 2 

\dx). 

Ati 

m 

T ' 

A, 

} 


[duj 

i 

/ 


+6xko — ^ x Ni + l (k,+l ^Afi + l) + ^^Ni + 1 ( 

(i)>l 


X Ni + l ^Ni + l) 


Ny+N? ( 

+ Z) I 6 *! 

i=Nx+l l 


+«*r+i 


-**•[*■ + ¥t«< 

AH (H) A, j-Sij [//(I,)] - Ai-j (e, + 12500000) 


— SuJ 

L ' \ au ;> 

\T /j»_\ z~T 


+£A Nl+Na+1 (£/) - <fojV,+;v 2 +l(A() - 0 


(25) 


From Eqs. (8) and (23) it is seen that Ay is given by [1 0 1^2 0J T . Also, we note that the only jumps 

are in the mass state and the mass costate and these jumps are 


m(t s ) - m(*+) = 51000 

A m (*7) - *m(<J) = Vl 


The finite element equations are solved using the method of Levenberg-Marquardt as coded in the IMSL 
subroutine ZXSSQ [11]. Running a case for a few elements generates a good approximation for larger numbers 
of elements. Initial guesses do not need to be very accurate, but the method is not nearly as computationally 
efficient as a Newton-Raphson procedure where sparsity in the Jacobian could be exploited. 

In Figs. 5 and 6 numerical results for the ALS model are given for 4 and 8 elements in each time interval. 
(The number of elements in each interval is completely arbitrary.) In Fig. 5 the altitude profile is shown and 
the control history is shown in Fig. 6. From past experience with the one-stage model, we believe the N\ = 8 
and N? — 8 to be a converged result. Furthermore, we see that even the N 1 = 4 and = 4 result gives a 
reasonable approximation to the solution. It should be noted that these results are not realistic because of the 
absence of state constraints, and because we have large angles of attack (more than 30° at some points) even 
though we assumed small angles in the state equations. However, they do suffice to illustrate the power of the 
method. 
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As an indication of the accuracy of the method in a global sense, the Hamiltonian was observed to converge 
to zero (the exact answer) all along the trajectory. The finite element results are converging to the exact solution 
as N increases. 


Conclusion 

In this paper we present a weak Hamiltonian formulation for optimal control problems. Results are pre- 
sented based on the weak Hamiltonian finite element formulation for a simple optimal control problem which 
show the method is reliable, efficient, and accurate. For this and other problems it is easily programmed with 
a self-starting algorithm. 

To address the future needs of real-time guidance for future space transportation systems, we present a 
four-state model for the dynamics of mass, altitude, energy, and flight-path angle. The angle of attack is 
the control. The results show the power and efficacy of the present approach. It has several advantages for 
applications in real-time guidance. It is not only reliable and efficient but has excellent self-starting capability. 
Furthermore, initial guesses of the costates are not needed, and the method exhibits a graceful degradation in 
performance with reduction in number of elements. 

For future research we intend to concentrate on improving computational efficiency by exploitation of the 
relatively significant level of sparsity in the Jacobian. We also will incorporate inequality constraints, and begin 
to address the avoidance of atmospheric anomalies. 
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Abstract 

In the design and analysis of robust control systems for uncertain plants, the technique of formulating what 
is termed an "M-A model" has become widely accepted and applied in the robust control literature. The "M" represents 
the transfer function matrix M(s) of the nominal system, and "A" represents an uncertainty matrix acting on M(s). The 
uncertainty can arise from various sources, such as structured uncertainty from parameter variations or multiple 
unstructured uncertainties from unmodeled dynamics and other neglected phenomena. In general, A is a block diagonal 
matrix, and for real parameter variations the diagonal elements are real. As stated in the literature, this structure can 
always be formed for any linear interconnection of inputs, outputs, transfer functions, parameter variations, and 
perturbations. However, very little of the literature addresses methods for obtaining this structure, and none of this 
literature (to the authors’ knowledge) addresses a general methodology for obtaining a minimal M-A model for a wide 
class of uncertainty. Since having a A matrix of minimum order would improve the efficiency of structured singular 
value (or multivariable stability margin) computations, a method of obtaining a minimal M-A model would be useful. 
This paper presents a generalized method of obtaining a minimal M-A structure for systems with real parameter 
variations. 


l. Introduction 

Robust control theory for both analysis and design has been the subject of a vast amount of research in this 
decade [1-35]. In particular, robust stability and performance have been emphasized in much of this work, as, for 
example, in the development of H°° control theory [10-15, 19-23]. Moreover, the development of robust control 
system design and analysis techniques for unstructured [1-9, 13, 19, 21] as well as structured [16-35] plant uncertainty 
continues to be the subject of much research - particularly the latter. Unstructured plant uncertainty arises from 
unmodeled dynamics and other neglected phenomena, and is complex in form. This uncertainty is called "unstructured" 
because it is represented as a norm-bounded perturbation with no particular assumed structure. Plant uncertainty is 
called "structured" when there is real parameter uncertainty in the plant model, or when there is unstructured complex 
uncertainty occurring in the system at multiple points simultaneously. Plant parameter uncertainty can arise from 
modeling errors (which usually result from assumptions and simplifications made during the modeling process and/or 
from the unavailability of dynamic data on which the model is based), or from parameter variations that occur during 
system operation. 

Robust control design and analysis methods for systems with unstructured uncertainty is accomplished via 
singular value techniques [1-9, 21 ]. For systems with structured plant uncertainty, however, the structured singular 
value (SSV) [16-27] or multivariable stability margin (MSM) [28-33] must be used. In order to compute the 
SSV or MSM, the system is usually represented in terms of an M-A model. The ”M" represents the transfer function 
matrix M(s) of the nominal system, and "A" represents an uncertainty matrix acting on M(s). In general, A is a block 
diagonal matrix, and for real parameter uncertainites the diagonal elements are real. As indicated in the literature 
[17,18,20,28 ], this structure can always be formed for any linear interconnection of inputs, outputs, transfer 
functions, parameter variations, and perturbations. However, very little of the literature discusses methods for 
obtaining an M-A model. For unstructured uncertainites, this model is very easy to obtain. However, for real 
parameter variations, forming an M-A model can be very difficult. In [29 ], De Gaston and Safonov present an M-A 
model for a third-order transfer function with uncertainty in the location of its two real poles and in its gain factor. 
Although the given M-A model is easily obtained for this simple example, other examples do not yield such a 
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straight-forward result. A general state model of M(s) for additive real perturbations in the system A matrix (where A 
is assumed to be closed- loop) is discussed in [34]. Unfortunately, this model is not general enough for many 
examples, since system uncertainty is restricted to the A matrix and the uncertainty class is restricted to be linear. 
Morton and Me Afoos [26 ] present a general method for obtaining an M-A model for linear (affine) real perturbations 
in the system matrices (A,B,C,D) of the open-loop plant state model. In this model, an interconnection matrix P(s) is 
constucted first for separating the uncertainties from the nominal plant, and then M(s) is formed by closing the 
feedback loop. The M-A model thus formed can be used in performing robustness analysis of a previously determined 
control system. If the feedback loop is not closed, ji-synthesis techniques [19-21] can be applied to the M-A model for 
robust control system design. Morton’s result essentially reduces to that of [34] when the perturbations occur only in 
the A matrix (and the A matrix of [34] is assumed to be open-loop). An algorithm for easily computing M(s) based 
on Morton’s result is presented in [35]. Although this method of constructing an M-A model is general for linear 
uncertainties, many realistic problems require a more general class of uncertainties. Furthermore, no consideration is 
given to obtaining a minimal M-A model, where "minimal” refers to the dimension of the A (or M) matrix. Since the 
M-A model is a nonunique representation, it would be prudent to obtain one of minimal dimension so that the 
complexity of the SSV or MSM computations during robust control system design or analysis could be minimized. 
However, none of the literature (to the authors’ knowledge) addresses the issue of minimality. 

This paper presents a methodology for constructing a minimal M-A model for systems with real parametric 
multilinear uncertainties, where the term "multilinear” is defined as follows: 

Definition : A function is multilinear if the functional form is linear (affine) when any variable is allowed to vary 
while the others remain fixed. For example, f(a,b,c) = a + ab + be + abc is a multilinear 
function. 

Thus, the allowance of multilinear functions of the uncertain parameters provides a means of handling cross-terms in 
the transfer function coefficients. A procedure is proposed for obtaining this model in state-space form for uncertain 
single-input single-output (SISO) systems, given the system transfer function in terms of the uncertain parameters. 

An extension of this result to multiple-input multiple-output (MIMO) systems will be given in a subsequent 
publication. In this development, M(s) will represent the nominal open-loop plant, so that the resulting M-A model 
may be used for robust control system analysis or design. The state-space form used in modeling M(s) is an extension 
of Morton’s result for real parametric linear (affine) uncertainties [26]. The paper is divided into the following 
sections. A formal statement of the problem to be solved in this paper is presented in Section 2, followed by a 
discussion of minimality considerations in Section 3. The approach is presented in Section 4, a proposed solution to 
the problem is presented in Section 5, and the proposed procedure for finding a minimal M-A model is summarized in 
Section 6. Several examples demonstraing the proposed solution are given next in Section 7, followed by some 
concluding remarks in Section 8. 


2 . Problem Statement 

Given the transfer function of an uncertain system, G(s,5), in either factored or unfactored form, as a 
function of the uncertain real parameters, 8, find a minimal M-A model of the form depicted below in Figure 1: 


G(s, 5) 



such that: 


1. The diagonal uncertainty matrix, A, is of minimal dimension. 

2. The model of the nominal plant, M(s), is in state-space form. 
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The model must handle multilinear uncertainty functions in any or all of the transfer function coefficients. In 
order to construct a minimal M-A model, the dimension of the A matrix must be minimized. Hence, factors which 
have been found to affect the dimension of the M-A model will be discussed next, followed by the approach used in 
forming a solution to this problem. 


3. Minimalit y Considerations 

In constructing an M-A model of an uncertain system, the A matrix can become unnecessarily large due to 
repeated uncertain parameters on its main diagonal. It is therefore of interest to examine the factors which can cause 
this repetition, so that the number of repeated uncertain parameters can be minimized. A factor which can be shown to 
cause unnecessary repetition in the A matrix is the particular realization used in representing the system. Examples 
can be constructed which demonstrate this effect, and it appears that a cascade realization (and, in particular, cascaded 
uncertain real poles and zeros) is a desirable form for obtaining a minimal M-A model. Thus, a general cascade-form 
realization will be part of the approach taken in constructing a minimal M-A model. A problem arises, however, in 
that some transfer functions have a form which precludes cascading uncertain real poles or zeros, such as: 

G( S ,S) = b l s 2 ± b? s + bj G(s, 8 ) = (s + 9i)(s + e 2 ) 

(s + 0 i)(s + 02 ) , s 2 + ai s + a 2 

where 0j and 02 are assumed here to be uncertain (and hence a function of 5). Cascading the poles and zeros for either 
case would result in improper transfer function blocks to be realized. For these cases, it is unavoidable for the 
minimal A matrix to have repeated uncertain parameters on the main diagonal. However, for each inseparable pole or 
zero pair it is only necessary to repeat one uncertain parameter. This issue will be addressed in the proposed solution, 
and a minimal M-A model for the first transfer function above will be given as an example. 

Another factor which affects the dimension of the M-A model is the form of the coefficients in the system 
transfer function. If any of the coefficients is a nonlinear function of the uncertain parameters instead of a multilinear 
function (e.g., there are squared uncertain terms in any of the coefficients), then extra dependent uncertain parameters 
must be defined in order to represent these terms in a multilinear form. For example, 5i 2 would be represented as 
8 ^ where 82 = 8 j, and both 8 ! and 8 2 would appear in the A matrix. Thus, for this case, it is again necessary that 
the minimal A matrix contain repeated uncertain parameters on its main diagonal. An example illustrating this 
situation will be presented later. 

These issues are addressed in the proposed solution for constructing a minimal M-A model. The approach 
taken in forming this solution is described in the next section. 


4. Approach 

Based on the problem definition and the minimality considerations outlined above, several issues will be 
addressed in forming a solution to the problem of constructing a minimal M-A model given the transfer function of an 
uncertain system. First, a general cascade-form realization will be found which can be used to obtain a minimal M-A 
model. Second, the minimal A matrix will be determined for any uncertain system such that extra dependent 
parameters are assigned to account for inseparable pairs of uncertain real poles or zeros as well as non-multilinear (e.g., 
squared) terms. Third, a method of obtaining a state-space realization of M(s) for any uncertain system will be found. 
Therefore, the proposed approach for constructing a minimal M-A model is given as follows: 

1. Obtain a cascade-form realization of the system so that the state-space uncertain model can be written as: 

x = A x + B u (1) 

y = C x + Du 

where: A = A q + [AA], B = B q + [AB], C = C q + [AC], D = D q + [AD] (2) 

The terms with the "o" subscript (A q , B q , C q , D q ) represent the nominal matrix components, and the "A" terms 

(AA, AB, AC, AD) represent the uncertain matrix components. To eliminate confusion of the A notation, the 
diagonal uncertainty matrix, A, of the M-A model will be represented as [A], and the AA, AB, AC, and AD matrices 
will be represented as [AA], [AB], [AC], and [AD] wherever clarification is required. 
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2. Obtain a minimal M-A model as described in the problem definition and pictured in Figure 1, where: 
a. The minimal uncertainty matrix. A, is defined as: 

A = diag [ 5 r 6 2> 5 3 6 m ] = diag [ 5j , Sp ] = diag [ 6 ] (3) 

where; A 6 R mxm , 8i e R m ‘ , 8 D € R mD , 8 g R m 

and: m = the minimal number of uncertain parameters 

mj = the number of independent parameters given in G(s,8) 
mj) = the minimal number of dependent (or repeated) parameters 

Also: p = [A] q (4) 

where: p = the uncertain parameters input to M(s), P e ^ mp 

q = the uncertain variables output from M(s), € 

Since an M-A model is minimal if the dimension of the A matrix, m, is minimal, where m depends 
on mj and m D , with mj being given and fixed, a formal definition of a minimal M-A model can be 
stated as follows: 


Definition : An M-A model is minimal if m D - i.e., the number of dependent (or repeated) parameters in the A matrix - 
is minimal (or zero, if possible). 


b. The state-space model of the nominal plant, M(s), is an extension of Morton’s result [26] and has 
the following form: 




( 5 ) 


where B xp , Cq X , Dqp, Dq U , and Dyp are constant matrices. Thus, M(s) can also be written in the 
equivalent shorthand notation defined as follows: 



M 11 (s) 

M 12 (s) 


A o 

B XP 

B o“ 

M(s) = 

M 21 (s) 

M 22 (s) 


C 

L C o 

D <P 

D yp 

D <n 

°o_ 


where: M n (s) = q(s)/p(s) = C qx (si - Aq)' 1 B xp + 

M 12 (s) = y(s)/p(s) = C qx (si - Aq) 1 B 0 + D qu 

M 21 (s) = q(s)/u(s) = C D (si - Aq)* 1 B xp + D yp 

M 22 (s) = y(s)/u(s) = C 0 (sl - Aq)' 1 B g + D 0 


( 6 ) 


(7) 


It should be noted that in [26] the matrix was required to be zero. In this paper, however, is 
allowed to be nonzero in order to model the multilinear (cross-product) uncertain terms. 

The results for constructing a minimal M-A model via this approach are presented in the next section. 
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5. Proposed Solution 

The proposed solution will be presented in two parts: the results for obtaining a cascade-form realization of 
the uncertain system will be summarized first, followed by the results for obtaining the state-space realization of a 
minimal M-A model. 

5.1 Cascade-Form Realization 

Given the transfer function of an uncertain system in terms of its uncertain parameters, G(s,8), it is desired to 
realize the system in a cascade form of first- and second-order subsystems. Thus, if the transfer function is given in 
unfactored form, the numerator and denominator polynomials must be factored into first- and second-order terms. The 
given transfer function will then be represented as follows: 

G(s,8) = K y (8) G c (s,8) G R (s,8) K u (8) (8) 

where K u and K y represent input and output gain terms, respectively, and Gr and G<^ represent the real (first-order) and 
complex (second-order) transfer function components, respectively. Then : 


Gr(s, 8) = G Rk (s,8)G Rk ^(s.8) ... Gr 2 (s,8)G Ri (s,8) 

(9) 

G c (s,8) = G Cl (s,8)G c , 1 (s,8)...Gc 2 (s,8)G Ci (s,8) 

(10) 

Grj (s, 8) = P 2M S + P 2i - 

(ID 

S + <Xi 

G c _ b3i-2 s 2 + b3i-i s + b3i 


s 2 + aa-i s + a 2 i 

(12) 


and: k = number of real (first-order) blocks 

1 = number of complex (second-order) blocks. 

Any or all of these transfer function coefficients may be uncertain. The uncertainty may arise from either the 
coefficient itself being uncertain, or from the coefficient being a function of one or more uncertain variables. 
Therefore, for either case, any of the coefficients may be a function of 8. Furthermore, the uncertain variables may 
have either an additive or multiplicative form: 

e = e Q + Sfc , e = e o (1+S E ) (13) 

The following cascade-form state-space realization of this system is proposed: 


G(s) = 

a R 

b C c R 

0 

AC 

B R K u 
EC Er K u 


Ky D c C R 

K y c c 

K y C C D R K u 


where: 
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Ar = 


A Rl 

B R 2 c R 1 

B R 3 Dr 2 CR 1 

"*4 D R 3 Dr 2 Cr 1 


b r 3 c r 2 

B R, D Ri C R 


BR k-l DR k-2" D R 2 C R 1 B Rk-l DR k;.2" Db 3 Cr 2 ' 

E R k I Vi' D R 2 C R I B R k D R k .r D R 3 C R 2 • 


B R k CR k-i A Rk_ 


b r = 


B R D R 
K 2 R 1 

b R„ D Ro 


2 R j 
>R 2 D R, 


BR k-l° R k-2 ' DR 2 D R j 

_ B R k DR k-r - DR 2 D Ri _ 

C R [ D R k DR k-i' D R 2 c Ri D R k DR k-r‘ D R 3 c R 2 ••• DR k CR k-i C Rk] < 17 > 

°R " L D R k DR kl ■DrjDrJ (18) 

The A^, B(\ Cq, and matrices have the exact same form as (15) - (18), except that the subscripts "R" and ”k” are 

replaced by "C” and "1”, respectively. The submatrices are defined as follows: 


A Ri = -<Xi 

C Ri = P2i ~ «i p2i-l 

A 0 1 t 

Aq = j 

L -a2i -a2i-i J 

Cq = [ (t> 3 i - a 2 i b 3 j. 2 ) (b 3 i-i - a 2 i-i b 3 i. 2 ) ] 


B Ri = 1 
D Ri = P2H 

Bq = [° 

1 J 

Dq = b 3 i _2 


The realizations { Ar., Br., Cr., Dr. ) and { Aq, Bq, Cq, Dq ) represent the i* real (first-order) and complex 
(second-order) systems G R .(s,8) and Gq(s,8), respectively. Thus, for the real subsystems, i = 1, 2, ... , k, and for the 
complex subsystems, i = 1, 2, .... 1. 
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The resulting cascade-form realization of the uncertain system is therefore given from (14) as: 


Ar 0 
Be Cr Ac 


BrK u 
. Be Dr Ku . 


C = [ Ky Dc Cr K y Cc ] D - Ky Dc Dr K u 


(21) 


The above model is a general cascade-form realization for any uncertain open-loop SISO transfer function. 

The model does not, however, handle nonmonic denominator polynomials with uncertain leading coefficients. This 
would result in fractional (i.e., rational) matrix elements in the realization with uncertain parameters in the 
denominator of these elements. For real uncertain poles or zeros, two factors determine whether the real (first-order) or 
complex (second-order) block form should be used. The first is the nature of the uncertainty associated with these 
terms, and the second is the form of the transfer function. If the real pole or zero locations are the uncertain parameters 
and the transfer function form allows these poles or zeros to be separated out, then the real block form should be used. 
If the transfer function form does not allow this separation, then the complex block form must be used. Furthermore, 
if there is a pair of uncertain poles or zeros that cannot be cascaded, then the resulting minimum A matrix will have a 
repeating parameter on the main diagonal for each inseparable pole or zero pair. Alternatively, if the coefficients of the 
second-order polynomial associated with the real poles are the uncertain parameters, then the complex block form 
should be used. These cases will be illustrated in the Examples section of this paper. The formulation of the 
minimal M-A model will be presented next. 


5.2 Minimal M-A Model 

In formulating the minimal M-A model, the minimal A matrix must be determined first, followed by the 
state-space realization of M(s). Thus, the results for formulating this model will be presented in this order. 

5.2.1 Minimal A Matrix 

The minimal A matrix is defined as in (3) with: 

m = mj + mj) (22) 

where mj is the number of independent uncertain parameters, and m D is the number of dependent uncertain parameters 
that must be added. The uncertain independent parameters are those defined in G(s,5). However, as discussed 
previously, the dependent uncertain parameters are those independent parameters that must be repeated due to non- 
multilinear terms in the transfer function coefficients and/or pairs of uncertain real poles or zeros that cannot be 
cascaded. Thus, for A to be minimal, m D (or 8 D ) should be minimized. It can be shown that if the system transfer 
function is formed from a given minimal M-A model of an uncertain system, the coefficients of the numerator and 
denominator polynomials will be multilinear functions of the uncertain parameters. Unfortunately, the converse is not 
necessarily true in general because of the dependence of the M-A model on the realization used for the plant. If the 
general cascade-form realization posed in this paper is used, however, the multilinear form of the transfer function 
coefficients can be used to establish that m = m T (i.e., m D = 0), unless there are real uncertain pairs of poles or zeros 
that cannot be cascaded. Furthermore, it can be shown that if the coefficients of all the factors of the numerator and 
denominator polynomials are multilinear functions, then the coefficients of the expanded polynomials will also be 
multilinear. However, if there are non-multilinear uncertain terms in the transfer function, then dependent parameters 
must be defined (and added to A) to represent the non-multilinear term in a multilinear form. Moreover, if the non- 
multilinear term is of the form 8 n , then n-1 dependent parameters must be defined. If there are pairs of real uncertain 
poles or zeros that cannot be cascaded, then one additional dependent parameter must be added for each pair, and the 
dependent parameter can be either of the uncertain real parameters in the pair. Therefore, the number m, as determined 
by these rules, is the minimal dimension of the A matrix for the uncertainty class considered in this paper. Once this 
minimal dimension is determined, the A matrix can be defined as a diagonal matrix, as in (3), with the specified 
uncertain parameters on the main diagonal. Examples which illustrate these cases will be presented later in Section 6. 

5.2.2 State-Space Realization of M(s) 

Once the cascade-form realization has been determined, the system can be modeled as in (1) and (2), where 
[AA], [AB], [AC], and [AD] are known functions of the uncertain parameters. Since any non-multilinear terms have 
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been redefined in a multilinear form when the minimal A matrix is determined, these matrices are multilinear functions 
of the parameters. In order to obtain a state-space model for M(s) as defined in (5), expressions for these uncertainty 
matrices must be determined in terms of the matrices B X p, Cqx> Dqp, Dq U , and Dyp from the model. Using (4) and 
(5), these expressions are determined as follows: 

[AA] = B X p [A ]( I - [A ] T l C qx = B xp (I-[A]D qp )'* [A ] C qx 

[AB] = B xp [A ] ( I - [A ] )'* D qu = B xp ( I - [A ] D^ )-» [A ] D qu (23) 

[AC] = Dyp [A ] ( I - Dqp [A ] ) _1 C qx = D yp ( I - [A ] D^ Y l [A ] C qx 

[AD] = Dyp [A ] ( I - Dqp [A ] )'* D qu = Dyp ( I - [A ] Dqp )'* [A ] D qu 

The inverse term makes computation of Dqp very difficult. Furthermore, the matrix inversion can cause [AA], [AB], 
[AC], and [AD] to have fractional (i.e., rational) elements with uncertain parameters in the denominator, which is not 
allowed in the uncertainty class being considered. Thus, it is desirable to represent this term in expanded form as 
follows: 


( I - [A ] Dqp)-1 = I + [A ] Dqp + ( [A ] Dqp)2 + ( [A ] Dqp )3 + . . . (24) 

where the latter form in (23) has been assumed. Then the above equations can be rewritten as: 

[AA] = B X p [A ] Cq X + B X p ( [A ] Dqp + ( [A ] Dqp )^ + ( [A ] Dqp )^ + ... } [A ] Cq X 

[AB] = B X p [A ] Dq U + B X p ( [A ] Dqp + ( [A ] Dqp )^ + ( [A ] Dqp )3 + ... } [A ] Dq U (25) 

[AC] = Dyp [A ] Cq X + Dyp { [A ] Dqp + ( [A ] Dqp )2 +([A]Dqp)3 + ... } [A ] C qx 

[AD] = Dyp[A]Dq U + Dyp{ [A]Dqp + ([A]Dqp ) 2 +([A]Dqp)3 + ... } [A ] D qu 

The second group of terms add in the cross-terms of the multilinear uncertainty functions. Each term in the series adds 
a higher-order cross-product term. Since [AA], [AB], [AC], and [AD] are multilinear functions with a finite number of 
terms, the Dqp matrix can be defined to have a special nilpotent structure such that: 

Dqp r+1 =0 (26) 

and; ( I - [A ] D qp r 1 = I + [A ] Dqp + ( [A ] Dqp ) 2 + . . . ( [A ] Dqp ) r (27) 

where r is the order of the highest cross-term occuring in [AA], [AB], [AC], and [AD], i.e.: 

r = max ( 0 A , 0 B , Oc, 0 D ) (28) 

where O a , Og, Oq, and Op represent the order of the highest-order cross-product term in [AA], [AB], [AC], and [AD], 
respectively. Cross-product term order is defined as: 

order ( 8 j 82 83 . . . 8 j ) = i - 1 (29) 

where i = 1, 2, . . . , m. Thus, the maximum value of r is r max = m- 1 , where m is the dimension of the A matrix. 
The required structure for Dqp to satisfy (26) and (27) is given as follows: 

1 . ) dji = 0 ; i = 1 , 2 , . . . , m 

2. ) If djj * 0 , then for i = 1, 2, . . . , m and j = 1, 2, . . . , m : (30) 

a. ) djj = 0 ; 

b. ) = 0 ° r di® 2 j ®2 = 0 or ... or di 0 ( m _j)j 0 ( m _i) = 0 

where the symbol "®" represents "modulo m” addition. The desired equations can therefore be written as: 
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(31) 


[AA] = B xp [A ] C qx + B xp { [A ] Dqp + ( [A ] Dqp ) 2 + . . . ( [A ] D^ ) r ) [A ] C qx 

[AB] = B xp [A ] D qu + B xp { [A ] Dqp + ( [A ] Dqp ) 2 + ... ( [A ] Dqp ) r ) [A ] D qu 

[AC] = D yp [A ] C qx + D yp { [A ] + ( [A ] ) 2 + . . . ( [A ] Dqp V } [A ] C qx 

[AD] = Dyp [A ] D qu + Dyp { [A ] Dqp + ( [A ] Dqp ) 2 + . . . ( [A ] Dqp Y ] [A ] D q jj 

where V is defined in (28). Since the [AA], [AB], [AC], and [AD] matrices are known for the given system, the 
equations in (31) are used to determine B xp , C qx> D qu , Dyp, and D qp . Once these matrices are obtained, the state- 

space model of M(s) is found. Hence, a minimal M-A model has been formed. 

A procedure which summarizes the necessary steps in obtaining a minimal M-A model using these results is 
presented next 


6. Summary of Procedure 

The following is a summary of the procedure implied by the above proposed approach for forming a minimal 
M-A model of a given uncertain system: 

i. ) Obtain the system transfer function in factored form. The coefficients of each factor should be a multilinear 

function of the uncertain parameters. If necessary, define new dependent parameters to represent any non- 
multilinear terms in a multilinear form. 

ii. ) Define the number of parameters in the A matrix, m, using (22). In so doing, determine if any new parameters 

are required to model inseparable uncertain real pole or zero pairs. If there are inseparable real pairs, either 
uncertain parameter in the pair may be repeated. 

iii. ) Define the minimal A matrix as in (3), using the independent parameters defined in the given transfer function 

as well as those defined in steps i.) and ii.) above. 

iv. ) Obtain a cascade-form realization for the system as a function of the uncertain parameters. 

v. ) Express the system matrices as in (2). 

vi. ) Determine the maximum order of cross-product terms, r, in [AA], [AB], [AC], and [AD] as defined by (28) and 

(29). Then [AA], [AB], [AC], and [AD] have the form represented in (31), where Dqp has the special 

(nilpotent) structure summarized by (30). 

vii. ) Express [AA], [AB], [AC], and [AD] as: 

[AA] = [AA 0 ] + [AAj] + [AA 2 ] + . . . + [AA r ] 

[AB] = [AB 0 ] + [ABj] + [AB 2 ] + . . . + [AB r ] (32) 

[AC] = [AC 0 ] + [AC!] + [AC 2 ] + ... + [AC r ] 

[AD] = [AD 0 ] + [AD]] + [AD 2 ] + . . . + [AD r ] 

where the subscript i represents the cross-terms of i* order in each uncertainty matrix. 

viii. ) The B xp , C qx , Dyp, and D qu matrices are found using the expansion described in [26] for the uncertainty 

matrices having zero-order cross-product terms; i.e. define: 


M = 


AAq AB 0 


|_AC 0 AD 0 


Mj 8i + M 2 8 2 + — + M m 8m 


(33) 


where the Mj matrices are appropriately partitioned. For the case of repeated parameters due to inseparable real 
poles or zeros, the M, matrix associated with the repeated parameter must be nonzero . These matrices can be 
decomposed into the product of appropriately partitioned column and row matrices as follows: 
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Mi = 


M Bi 

MdJ 


[ Mq Mps] 


(34) 


where Mg. forms the i^ 1 column of B X p, Mj forms the i* column of Dyp, forms the column of 
Cq X , and forms the i^ 1 column of Dq U . Thus: 


Bxp = [ 

M Bl 

M B2 ••• 

MbJ 

Dyp = [ 

m D„ 

Md 12 • • • 

Md,J 

P 

II 

Mq T 

Mc 2 t 

Me.] 

e 

II 

r 1 

MdJ 

••• 

MdJ 


(35) 


ix. ) Use the higher-order cross-terms of [AA], [AB], [AC], and [AD], as in (32), to determine the elements of the 

Dqp matrix. Begin with the first-order terms and specify as many elements as possible. Continue with the 
second-order terms, and proceed until all elements of Dqp are specified. Check Dqp to ensure that the required 
special structure of (30) and, hence, (26) is satisfied. 

x. ) Form the minimal M-A model as given in (3), (5) and (6), and as pictured in Figure 1. 

It should be noted that the matrices Mg^, M^;., and obtained in decomposing the Mj matrices 

in (34), are not necessarily unique. A method of formalizing this decomposition for computer implementation will 
not be addressed in this paper. However, an algorithm is presented in [35] which accomplishes this decomposition for 
a more restrictive uncertainty class. Some examples will be given next to illustrate these results. 


7. Examples 

The following examples illustrate the proposed procedure presented above. Due to space limitations, details 
of each step will not be included. However, the results that are presented for each example should be fairly easily 
obtained. 

Example 7.1 

Consider the following uncertain system: 

G(s 5) _ (Pi S + P 2 XP 3 s + p 4 )(bl S 2 + b 2 S + b 3 > 

(s + oci)(s + ot2)(s 2 + ai s + a2> 

where: 04 = a lo + 8 ai . a 2 = a2 0 + &a 2 • a l = a l 0 + 5 ai > a 2 = % + 8 a 2 

Pi = Pl 0 + 8 3! • P2 = P2 0 + 8 P 2 • P 3 = P 3 0 + 8 P 3 » P4 = P 4 0 + 8 p 4 

b l = b l 0 + 8 bi > *>2 = b 2 0 + 8 b 2 • b 3 = b 3 0 + 8 b 3 
The cascade-form realization of this example is found in a straight-forward manner to be: 






1 

-ai 

0 

0 

0 


Pi 

P2-CX1P1 

- 0-2 

0 

0 

B = 

0 

0 

0 

1 


0 

. P3(p2-«ipi) 

P 4 - 012 P 3 

-a 2 

-ai . 


. P 1 P 3 _ 
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[ 0 0 0 0 1 0 pi„ 0 piop3o 0 0] 




Example 7.2 


This example illustrates the case of an inseparable uncertain real pole pair. Consider the following system: 

G(s, 5) = b i s2 + *>2 s ± ** 

(s + 0 l)(s + 02) 

where: bi = W + 8 b , b2 = t>2 + 8b 2 • b 3 = b 3 0 + 5 b 3 

0i = 0 i o + 5 01 , e 2 = 0 2q + 5 e 2 

Since the numerator is second order with uncertain coefficients, the uncertain real poles in the denominator cannot be 
separated into the real cascade form. The denominator must therefore be expanded, and the complex (second-order) 
block used in the realization, which is given as follows: 

A-f ° * 

-010 2 "(01+02) 

C = [ (b 3 - 0i0 2 bi) (b 2 - (0i+0 2 )bi)] . D = [bi] 

Since there is one inseparable uncertain pair of poles, either 8 0 j or 892 must be repeated in the A matrix. (It can be 
shown that if this is not done, Dqp will not have the required structure and hence the higher-order cross-product terms 
will not be modeled correctly.) Since there are no non-multilinear terms in any of the transfer function coefficients, 
m = 6 (i.e., five given independent uncertain parameters plus one dependent parameter for the inseparable pole pair). 
The resulting A matrix can therefore be defined as follows: 

A = diag [80^80^502, 8b r 6b 3 l 



where 8 6l was arbitrarily chosen to be repeated. Using the proposed procedure, the following results are obtained: 


Dqp — 


Bxp - 


Cqx ~ 


0 

0 

l/02o 

1 

0 

0 


0 0 0 0 0 0 
L-l -1-1 0 0 0 


Dyp — [~bi 0 -bio -bio -1 1 1] 


02o 

0 

01o 

01o02o 

0 

1 

0 

0 

0 

1 

0 

0 


0 

1 

1 

(01o+02o) 

1 

0 


0 

0 

0 

1 

0 

0 


0 

0 

0 

0 

0 

0 


Dq U — 


0 
0 
0 
-1 
0 

L 0 J 


0 

0 

0 

0 

0 

0 


0 

0 

0 

0 

0 

0 
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Example 7.3 


This example illustrates the case of non-multilinear terms in the transfer function. Consider the second-order 
uncertain system: 


G(s,8) = 


1 


s 2 + 2 ct s + 0) 2 


where: 


ct = a +5 
o a 


to = co +5 
o co 


This is a second-order system with uncertain complex poles. The uncertainty appears in the real and imaginary 
components of the complex poles. The constant coefficient, to 2 , is not a multilinear function of the uncertain 
parameters. Substituting for ct and to in the above transfer function yields the following equation: 

s 2 y = u - 2 ct q sy - to Q 2 y - 28 a sy - ( 2a> o 8 ffl + S^ 2 ) y 

In forming an M-A model, the problematic term is 8^ 2 because it is not multilinear. In order to represent this 
equation in multilinear form, the following dependent variable is defined: 

8 , = 8 

3 to 

so that s 2 y = u - 2CT Q Sy - to Q 2 y - 28 a sy - (2®^ + 8^8^ 

Then tfie following equations can be defined: 


Qj = -2 sy 

q 2 = -y 

% - ‘ 2< V * P 2 


P 1 - 8 <, q l 
p 2 - 5 co q 2 

p 3 = 8 3 q 3 


Thus: 


s 2 y = u - 2 ct sy - to 2 y + p, + p_ 
3 o 3 o 1 3 

The realization of M(s) for the resulting M-A model can be depicted as follows: 


Xl 


-X2- 



0 

-cog 


1 

-2a 0 


xi 

LX2 


0 0 0 

L 1 0 1 J 


Pi 

P2 

P3j 


'qf 


0 -2 


" Y 1 


"0 0 0" 


Pl' 

q2 

= 

-1 0 


V/» 

+ 

000 


P2 

_q3_ 


. -2co 0 0 _ 


L A 2 J 


-0 1 0- 


-P3 _ 


y = [ i o] 


xi 

LX 2 J 


The A matrix is given by A = diag [ 8^ , 8^ , 83 ], where 8^ = 8^. 


These examples illustrate the proposed procedure for forming a minimal M-A model of an uncertain system. 
Although all the steps involved in obtaining these results have not been included, the stated results should provide a 
guide in performing the steps of the proposed procedure. It should be noted that, for ease of hand computation, the 
examples included only the simplistic (and less realistic) case in which the coefficients themselves are the uncertain 
parameters. However, it is emphasized that the proposed procedure does handle the more realistic case in which the 
uncertain transfer function coefficients are multilinear functions of the uncertain parameters. 
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8. Concluding Remarks 


This paper has presented a proposed procedure for forming a minimal M-A model of an uncertain system 
given its transfer function in terms of the uncertain parameters. The uncertainty class considered in this paper allows 
the transfer function coefficients to be multilinear functions of the uncertain parameters, and the uncertainties may arise 
in the A, B, C, and D matrices of the system model. The proposed procedure involves realizing the system in a 
cascade form, determining the minimal A matrix of uncertain parameters, and obtaining a state-space model for the 
nominal system, M(s). Three examples were given to illustrate the proposed procedure. The first example had eleven 
independent uncertain parameters, which arose in the A, B, C, and D matrices of the system realization. The second 
example had uncertain parameters arising in the A, C, and D matrices only. This example illustrated the formulation 
of a minimal M-A model for a system with inseparable real uncertain poles, and involved repeating an uncertain 
parameter in the A matrix. The last example had uncertainty in the A matrix only, and illustrated a method for 
handling non-multilinear terms. 

Further work on the proposed procedure will be to include systems having a nonmonic characteristic 
polynomial with an uncertain leading coefficient, as well as systems having an inner feedback loop which may or may 
not have uncertainties. The latter case may require a modification in the formulation of the cascade realization. 
Although the procedure presented in this paper is for SISO systems, an extension to MIMO systems will be 
forthcoming, and should primarily involve modifying the cascade-form realization. Other areas of future work include 
development of a simple means of verifying the minimality of a given M-A model, and development of a method of 
reducing a nonminimal M-A model to a minimal form. 
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Abstract 

The Space Station Attitude Control System software test-bed provides a 
rigorous environment for the design, development and functional 
verification of GN & C algorithms and software. All Space Station systems 
and sub-systems that are controlled or monitored by the GN &, C software 
are simulated. The simulation presents a major computational challenge, 
starting from the simulation of full nonlinear flexible body dynamics 
including the orbital environment and Mobile Servicing System (MSS) 
operations, to task scheduling, sensor dynamics and inter-module 
communication. In addition, the complex tasks of providing flight algorithm 
sequencing and control and input command validation needs to be 
addressed. 

This paper describes the approach taken for the simulation of the 
vehicle dynamics and environmental models using a computationally 
efficient algorithm. The simulation includes capabilities for 
docking/berthing dynamics, prescribed motion dynamics associated with 
the Mobile Remote Manipulator System (MRMS) and microgravity 
disturbances. The vehicle dynamics module interfaces with the test-bed 
through the centroal Communicator facility which is in turn driven by the 
Station Control Simulator (SCS) Executive. The Communicator addresses 
issues such as the interface between the discrete flight software and the 
continuous vehicle dynamics, and multi-programming aspects such as the 
complex flow of control in real-time programs. Combined with the flight 
software and redundancy management modules, the facility provides a 
flexible, user-oriented simulation platform. 
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Abstract 

Powell’s nonlinear programming code, VF02AD, has been used to generate approximate 
minimum-time tip trajectories for 2-link semi-rigid and flexible manipulator movements in the 
horizontal plane. The manipulator is modeled with an efficient finite-element scheme for an n-link, 
m-joint system with horizontal-plane bending only. Constraints on the trajectory include bound- 
ary conditions on position and energy for a rest-to-rest. maneuver, straight-line tracking between 
boundary positions, and motor torque limits. Trajectory comparisons utilize a change in the link 
stiffness, El, to transition from the semi-rigid to flexible case. Results show the level of compliance 
necessary to excite significant modal behavior. Quiescence of the final configuration is examined 
with the finite-element model. 

Introduction 

Trajectory planning is essential in budgeting a manipulator’s actuator efforts to maximize 
productivity. For repetitive tasks, the minimum-time maneuver goes hand-in-hand with this goal. 
A variety of approaches have been advanced for rigid manipulator control, taking advantage of the 
fact that all or some of the controls take the form of switching functions between actuator bounds. 
Bobrow [1] used an intuitive approach to generate optimal switching controls, as well as proving 
the boundedness of the controls. Weinreb and Meier [2][3] used calculus of variations approaches 
to incorporate control bounds in the problem formulation. In a second study, Bobrow [4] used 
numerical optimization to generate spline fits to the switching controls. 

Switching functions do not lend themselves to maintaining tip accuracy for non-rigid struc- 
tures. One would hope that the applied controls do take advantage of the bounds to maximize 
performance, but a clear analytical directive for this does not exist at the present. 

In filling this void, parameter optimization techniques can provide approximate optimal perfor- 
mance solutions for systems driven by complex, highly nonlinear dynamic models with arbitrary 
equality or inequality constraints. Of these solution techniques, Powell’s Recursive Quadratic Pro- 
gramming algorithm [5], embodied in the code VF02AD, has proven to be a robust tool for a variety 
of aerospace applications [6][7][8], and will be used in this study. The primary drawback to this or 
other numerical optimization methods is the dependancy on accurate gradient approximations of 
the performance index and constraints with respect to the parameters. 

The ensuing discussion initially describes the structural dynamics model of the manipulator, 
followed by the optimal control problem and parameterization of the controls, and ends with the 
results of a computational experiment. 

The manipulator structure modeled in this study, and presently under fabrication, is a 2-hnk 
cantilever arrangement constrained to slew in the horizontal plane. Tall, thin links are used to 
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minimize vertical plane droop. The hub or joint- 1 actuator slews both links, an interlink motor, 
and tip payload. The interlink or joint-2 actuator located at the end of link-1 slews the second 
link and the tip payload. The joint- 1 /joint-2 actuator torque ratio is about 4/1. The complete 
manipulator is about 0.5 meters (m) tall and 1.2m long. 

The Structural Model 

There is a long literature discussing the difficulties of simulating the vibrations of rotating 
structures[9j [10] [11]. The problem seems to arise from kinematics that are of second order impor- 
tance in nonrotating problems, but become of first order importance in the presence of rotational 
accelerations. Additonally, there are constraints inherent to the flexible link problem which must 
be satisfied: motions occur entirely in a horizontal plane; one end of the chain of links is attached 
to a stationary hub; and each flexible link is inextensible. These considerations motivate the de- 
velopment of a mathematical model that faithfully carries the full kinematics of the problem. It is 
also necessary to devise such a model in a form that will lend itself to real-time calculations. 

The need to meet these apparently conflicting demands motivated the development of a model 
specialized to flexible, multiiink structures. That apparently successful strategy is outlined below. 
The full kinematics are retained by expressing the configuration as functions of convected coordi- 
nates. This is a traditional approach in nonlinear elasticity [12]. Further, the kinematic variables 
are selected so that all geometric constraints (fixed hub, planar motion, and non-extension) are 
automatically satisfied. 

Since motions are assumed to occur entirely in a plane, it is also assumed that the elastic lines 
of the links as well as the mass centers of the cross sections all lie in the same plane. Each cross 
section is identified by its arc-length distance from the hub, so that the orientation of the center of 
the cross section s at time t is 


= cos(9(s y t))t + «n(0(j,t))J 

The location of the center of cross section s at time t is obtained by integration of the above unit 
tangent vector: 

*(M)= / ^{s 9 } t)ds 9 

Jo 

Similarly, the velocity at the cross section s at time t is obtained by integration of the time derivative 

of 0{s,t): 

*(M) = f d{s\t)l(3 9 ,t)d3 9 

Jo 

where 

f(M) = -«n(0(j,t))r+coa(0(j,f))/ 

The above description of configuration - entirely in terms of 0(s, t) - causes all of the geometrical 
constraints to be satisfied automatically. Additionally, the above description expresses the configu- 
ration in terms of one unknown field (0), instead of the more conventional two or three fields (x v 
Sc0). 

The governing equations of the dynamics are derived using those kinematics and a frame- 
invariant variational method - Hamilton’s principle. A finite element discretization is used to cast 
the resulting integro-differential equations for 0(s, t ) and its first and second derivatives into a sys- 
tem of fully- coupled, nonlinear algebraic equations. Particularly important for the application at 
hand is the observation that since all spatial integrals are with respect to the convected coordinate, 
s, those integrals are configuration-independent and need be done only once. The nonlinearities 
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remain, and a new nonlinear system must be solved at each time step, but the time consum- 
ing quadrature process can be done in advance of the dynamics simulation. These features are 
illuminated below through derivation. 

Hamilton’s principle is that 

6 f h [. KE(t ') - SE(t') + WE(t')] dt ' = 0 (1) 

Jh 

where KE is kinetic energy, SE is strain energy, and WE is external work. Quantities associated 
with the end times, t x and 1 2 have to do with initial and final conditions and the conservation 
of momenta, but the governing equations necessary to model motion of the flexible structure are 
obtained by consideration of the integrand alone: 

6KE(t) - SSE{t) + SWE(t) = 0 (2) 


for all t x < t < t 2 . 

The kinetic energy is that of the flexible links plus that of all concentrated masses and concen- 
trated moments of inertia: 


j ^ i -1 maaaea . inertia* 

KE i t )=~ p{a)x{a,t)-x{a,t)ds + - ^ M k x(a k , f) • £(a k , t) + - ^ hO(ai,t) ■ 0(a h t) 

ZJ0 1 k=i 1 /=i 


jfe=l 

The strain energy is that of the flexible links: 

rL 


s m = \l 


Discretization of the above energy terms is obtained by discretizing the tangent vector /3 as: 

node* 

0 = £ A*(0Pn( J ») 

n— 1 

where the shape functions, p n , have support over intervals that are small relative to the anticipated 
radii of curvature. The above condition on the support of the basis functions is necessary to assure 
compliance with the condition of nonextension. The resulting energies are: 

1 nodes node a 

*£(0=2 £ £ (3) 

m=l n= 1 

and 

1 node* node a 

= 2 E E A-(') •&(<) Jfm,» (4) 

m=l n=l 

where 


j>£, trier iia* 

Mm.n = / p(a)q m (a)q n (a)da + ^ M k q m (a k ) q n (a k ) + £ Ii6 K (l,m)6 K (l,n) , 

Jo k = 1 f= 1 

9 m(j) = / Pm(a)di , A'm.n = / k (a) p' m (a) p' n (a)da , 

Jo Jo 

is the Kronecker delta function, s is a dummy variable, and p^ n are spatial derivatives. 
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After appropriate integration by parts, the integrand of equation 2 becomes: 

node j 

Y. *M-7m(0 ■ n - 1 m(i) * Pn{t)K m<n + 7 m(0 * ) = 0 (5) 

n=l 

for all nodes m. In the above equation, r n is the external torque applied at node n. After /3 n (f) is 
expanded: 

£„(*) = <U<)7n(0 + (^„(t)) J 4n(t) 

and Equation 5 is invoked for all 69 m , a complete set of nodes second order equations in the nodes 
unknowns, 6 n , results as follows: 

node* 

Y 7m (0 • 7n(0^n(0^m,n ~ 7 m(0 ' A»(0(^n(0) 2 -^m,n + 7m(0 ‘ ^n(0^m,n = 7m (0 * ^m(<) (6) 

n=l 

The above problem formulation, involving only one unknown field, automatically satisfying all 
constraints, and requiring only one evaluation of element mass and stiffness matrices, lends itself 
to rapid numerical calculation. A computer code to generate and solve the above described system 
of equations for each time step has been written, tested and used by VF02AD. Both the derivation 
and code mentioned above are described more fully in Ref. [13]. 

One particularly interesting calculation, discussed in the literature as being difficult to solve, is 
that of a beam accelerated around a hub to an angular frequency above the first bending frequency 
of the beam. This is a particularly stringent test of flexible-dynamics codes, testing numerical 
robustness, as well as the correctness of the physics. Figure 1 shows the motion of the tip of such a 
flexible beam relative to the tip of a rigid beam rotating at the hub velocity. Initially, the flexible 
beam lags its rigid counterpart; it then overtakes and oscillates about the rigid beam. These results 
are in near exact agreement with the results presented in [9], obtained from a much more complex 
model. 

Optimal Trajectory Shaping 

The principle goal in this study is to combine the physics of the structure with optimiza- 
tion techniques to generate actuator torque histories for accomplishing a useful task with minimal 
degradation in performance. A secondary objective is to shortcut the work of an eventual feedback 
controller, which will be needed to compensate for modeling errors. 

Looking towards maximizing productivity in some repetitive task, a minimum-time tip tra- 
jectory was chosen for investigation. Constraints on such a trajectory include: completing a 
rest-to-rest maneuver, tracking a specified path (®(f),y(f))tt P , slewing between specified endpoints 
[{x{to)y v{to))y (*(*/), y{tf))]ti P y and not exceeding actuator torque limits ri f3m „ . 

The configuration initially starts at rest. Driving a flexible structure to rest at the final time, 
*/, necessitates end constraints on both kinetic and potential or strain energies (KE(tf), SE(tf)). 
The chosen path is a straight line and actuator torques limits are constants. The problem can be 
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restated as 


minimize: J — tf 

- finite element model 

subject to: - input actuator torques, Ti | 2 (t) 

- known initial conditions 
constrained by: 


- 


x tip{tf) x tpecified{tf) 

- 




~ V •pecificd(t f ) 

= 


<?;(*/) 


Jo' [yt»p(®t»p(0) — y/tne(*e»p(0)] 

= 

0 

= 

KE{tf) 

= 




SE(tf) 

— 




/o'(ln(0l - 

Jo'tMOI-buJ)* J 

< 



j= 1,7 

< 

. 


Note that the equality tracking constraint, C 3 , and inequality torque constraints, C 6 ,C 7 , are for- 
mulated as integrals. In addition, equality constraints on energy are point constraints. Both of 
these items will have profound effects on the example trajectories to be generated. 

Parameterization of the Controls 

To approximate optimum system performance from the aforementioned structural model, a 
suitable parameterization of the controls, ti i2 (£), is necessary. The choice of “parameterizable ” 
torque functions is essentially limitless. For this study, the simplest case of using tabular values of 
torque, r», as parameters at equal-spaced fixed times, ti 7 for both joints was chosen, or 

Ti{ti),T 2 {ti ), *=l,n 0<ti< t fj 

which results in 2 n control parameters. 

However, since the final time is changing due to minimization, the loss of control history def- 
inition would result if the times at which the control parameters are defined remain fixed in an 
absolute sense. To correct this, ti | 2 were specified at equally- spaced, nondimensional node points, 
Ci = ti/tfj where 

Ti(Ct)> r 2 {Ci)j i = l,n 0 < & < 1, 

This allows the torque histories to “stretch” naturally over the trajectory length. Using this mod- 
ification, it is necessary to add tf as a parameter also, resulting in 2n -f 1 control parameters to be 
found. 

Numerical derivatives of the performance index, £/, and the constraints, Cj(t/), provided to 
VF02AD are central finite-difference approximations. To generate derivatives with respect to a 
torque parameter, Ti j2 . = T 1|2 (tj), the parameter is perturbed equidistant about its nominal value, 
and complete trajectories (or integrations of the structural equations) are computed to the current 
nominal tf to produce perturbed Cj(tf) values. Since derivatives are computed over the current 
fixed ty, the derivatives, dtf/dri f2i ~ 0, and only the derivatives, dC j{t f) / dr\ %2i / 0. Naturally 
both tf)Cj(tf) gradients with respect to £/, evaluated over the current nominal torque histories, 
are nonzero, (where dtf/dtf — 1.). 

Results 

The following finite- element structural model was used for the manipulator to produce the 
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sample trajectories. 


ITEM 

COMPOSITION 

LENGTH 

(m) 

MASS 

(kg) 

STIFFNESS, El 
(newton-m 2 ) 

joint -1 bracket 

1 element 

.0635 

.545 

10* 

link-1 

3 elements 

.5040 

.640 

10 2 ,10 3 

1st joint- 2 bracket ± 

1 element ± 

.1070 

5.415 

10 B 

joint-2 

2nd joint- 2 bracket 

1 point mass 
1 element 

.1040 

.830 

10 B 

link 2 

3 elements 

.4890 

.313 

10 2 ,10 3 

Totals: 

9 elements 

1.2675 

7.743 



The two values of stiffness, EI y for links 1,2 represent the trajectory comparison for this study. 
The brackets, modeled with a stiffness of 10 B , are considered rigid. Point moments of inertia were 
used to define mass distribution for the brackets. No payload was used in this comparison. The 
joints were assumed to have no compliance or damping. 

The two trajectories, computed on a CRAY-XMP, were integrated for 100 time steps, where 
A t = ,01£/. Trajectory evaluations for gradient computations executed in 0.75 secs. The torque 
histories for each joint were composed of 21 tabular values, where A£ = .05. Torque bounds were 
chosen as ±16, ±4 newton-m (n-m) for joints 1 and 2. The path to be tracked for this study was 
the line connecting («,y) pairs, (0.0,1.13) and (1.13,0.0). A composite of the slew motion for the 
“flexible” case ( EIunk § = 100 newtons-m 2 ) is given in Fig.2. The parameterized torque histories 
that created this slew represent 100 iterations of VF02AD after initialization with the parameter 
solution values from the “semi-rigid” case ( Elunk # — 1000 n-m 2 ). The tip path traced is reasonably 
straight, but does contain some small ripples - a result of the integral statement of the tracking 
constraint, C^tf). 

Figure 3 graphically depicts the difference between the semi-rigid and flexible links. Shown 
is the angular velocity of the finite-element node adjacent to joint 1. The frequency of vibration 
for Elunkg — 100 is about 19 hz. From examination of Fast Fourier transforms (FFTs) of the 
finite-element output of the system disturbed about the initial, midtime, and final positions, this 
appears to be one of the lower modes. Note the low angular velocity of the semi-rigid system after 
t/tf = 0.9, implying that a significant amount of time is being expended in order to bring the 
system to “rest”. This phenomena is definitely at odds with purely rigid system behavior. The 
KE(tj) = 0 constraint imposes a zero final angular velocity in both cases. Also, note that tf for the 
flexible case is slightly greater than the semi-rigid one. This demonstrates the approximate nature 
of the solutions, insofar as the semi-rigid solution not converging as well. 

Fig.4 shows the tj profiles. These still retain some of the boundedness qualities of purely rigid 
configurations. However, they begin and end near zero instead of the bounds (±16 n-m), and the 
transition between bounds is comparitively gradual. The oscillatory behavior in the EIu n k 9 — 100 
torque is probably counteracting the excitement of the lowest structural modes, which would have 
the greatest impact on the position constraints. Note the abruptness of the controls near the end 
in an attempt to quiet the structure. The r? torques in Fig.5 show minimal activity for most of 
the trajectory, except close to the end in order to accomplish the rest state. Note that this closing 
maneuver starts sooner with the more flexible link structure. 

The straight-line tracking error in millimeters (mm) is shown in Fig. 6. Both torque histories 
appear to limit the error to ±5 mm except near the end of the EIu n k s = 1000 trajectory, where 
the error momentarily “escapes” before returning to zero to satisfy the end position constraints, 
( Cx(tf ), Cjft/)). One drawback to the integral formulation is that it can relax tracking performance 
in isolated parts of the trajectory, yet yield a reasonably low residual (« 0) for Ca(</). It may be 
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necessary to add interior point constraints to significantly decrease this error. A less complex 
alternative is to reduce the EIu n k 8 more gradually between solutions, in the fashion of a liomotopy 
scheme, until the desired value is reached. 

The kinetic energy of the flexible structure at peak rates (t/tf « 0.45) is, not surprisingly, 
higher than the semi-rigid one as shown in Fig. 7. It is interesting that KE appears to be devoid of 
oscillatory behavior in both cases. Note again, the rest phase of the trajectories above t/tf = 0.9. 
Strain energy is shown in Fig. 8. This again displays the contrast first seen in Fig. 3. The semi-rigid 
structure produces relatively little strain, while the flexible-link configuration again contains the 
19 hz mode with a sizable increase in energy magnitude. Note the peaks in both cases, mirroring 
the sharp 7* changes in Fig. 4. Also, note the enforcement of the SE(tf) = 0 point constraint at the 
end. The SE “floor” in both cases corresponds to the KE peaks in Fig.7. 

The finite-element model was also able to examine the quiescence of the final state. At t/, joint 
node accelerations can be zeroed and joint node positions fixed at their nominal final values. The 
corresponding torques applied to maintain these values can then be computed. Fig. 9 shows the 
residual T\ being applied after the approximate optimal tf — 1.505 secs for EIu n = 100 has been 
reached. The oscillations in this residual appear to contain frequencies in the vicinity of 2.5 and 35 
hz. The lower appears to correspond to a fixed-free mode of the first link with the remainder of the 
system mass concentrated at the end. This mode was not seen during the optimized part of this 
trajectory. The second frequency may be a higher harmonic of the mode seen earlier, or possibly 
a numerical artifact. The small magnitude of this residual coupled with the spacing of the modes 
may be ideal targets for attack by a linear feedback control scheme. As an open-loop alternative, 
augmenting the tabular torque controls with frequency- based, parameterized functionals may be 
sufficient to suppress these oscillations. 

Conclusions 

A robust, parameter optimization tool has been able to generate actuator torque histories 
for approximate, minimum time slewing maneuvers containing a variety of continuous and point 
constraints for a 2-link flexible manipulator. The parameters, or actuator torques, for each link 
were tabular values at fixed node points during the maneuver. Perturbations were made to each 
parameter to approximate final time and constraint gradients. The efficient formulation of the 
finite-element model made the numerical optimization procedure a realistic endeavor. 

The accuracy of the straight-line tip tracking was good. Additional interior constraints and/or 
a more gradual change in the stiffness should yield further improvements. For the trajectory used 
in this study, joint-1 applied most of the input, which included cancelling lower-mode vibrations 
for the structural as a whole. Energy constraints were effective in bringing the structure to rest at 
tf • It was also demonstrated that final energy constraints do not preclude vibrations during the 
slew. The intended production use of the manipulator will dictate whether this is a hindrance or 
not. Finally, the secondary goal of providing enough vibration control to maximize the success of 
a linear feedback controller in treating residual oscillations appears feasible. 
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Figure 1: Single beam relative tip motion 


NORMALIZED TIME TA r 

Figure 3: Joint 1 node angular velocity 
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Figure 2: Composite motion for Elunkt = 100 
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Figure 4: tj joint torque histories 
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Figure 5: tj joint torque histories 
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Figure 6: Straight-line tracking error 
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Figure 7: Kinetic energy histories 
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Figure 8: Strain energy histories 
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ABSTRACT 

Conventionally kinematical constraints in multibody systems are treated 
similar to geometrical constraints and are modeled by constraint reaction 
forces which are perpendicular to constraint surfaces. However, in 
reality, one may want to achieve the desired kinematical conditions by 
control forces having different directions in relation to the constraint 
surfaces. In this paper the conventional equations of motion for multibody 
systems subject to kinematical constraints are generalized by introducing 
general direction control forces. Conditions for the selections of the 
control force directions are also discussed. A redundant robotic system 
subject to prescribed end-effector motion is analyzed to illustrate the 
methods proposed. 


1. INTRODUCTION 


In many applications of multi body systems certain points are desired to 
follow prescribed paths, such as the end-effector in a robotic system. 
Such kinematical conditions are treated constraint equations to 
determine the system motion and the control forces. 

In this paper those constraints which arise from geometrical 
restrictions such as closed loops and physical guides are termed 
geometrical constraints. On the other hand, kinematical constraints are 
defined as those conditions which represent desired motions or desired 
paths of certain points or bodies. 

In the conventional methods of analysis, regardless of the 
fundamental dynamic equations (Newton-Euler, Lagrange, Kane, etc.) used, 
the constraints are modeled by constraint reaction forces which are 
perpendicular to the constraint surfaces. (See Arnold [1], Hemami and 
Weimer [2], Kamman and Huston [3], Wehage and Haug [4], Nikravesh [5], 
Kim and Vanderploeg [13], Amirouche and Jla [6].) 

However kinematical constraints do not have to be satisfied by 
constraint reaction forces, and usually have to be realized by control 
forces applied by the actuators In the system. Hence the conventional 
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solution procedure imposes an arbitrary restriction to the directions of 
the control forces. Depending on the places of the actuators in the system 
one may want to achieve the desired kinematical conditions by control 
forces having different directions in relation to the constraint surfaces. 

In this paper the conventional equations of motion are generalized by 
inroducing general direction control forces for kinematical constraints, 
that replaces the constraint force representation. And the dynamic 
equations for multibody systems subject to geometrical and kinematical 
constraints are developed. By the proposed method of solution the control 
forces and the system motion are solved simultaneously. 

This paper is divided into five sections. After the introduction, the 
second section outlines the conventional equations of motion for 
constrained multibody systems. In the third section the general direction 
control forces for kinematical constraints are Introduced and the 
conditions for the control force directions are discussed. In section four 
simulations of a redundant manipulator by the conventional and the 
proposed methods are presented. Conclusions form the last section. 


2. CONVENTIONAL EQUATIONS OF NOTION 


Consider a mechanical system where qi , — ,qn are a set of generalized 
coordinates chosen for convenience to specify the configuration of the 
system. Let the system be subject to c constraints. Kane’s equations for 
an arbitrary system of particles and rigid bodies can be expressed as 
(Kamman and Huston [3], Baumgarte [7]), 


F* + F + F c = 0 (1) 

where 

F£ = A, 1=1,. ...c , p=1,...,n (2) 

*y P 

F* , F and F c are the vectors of generalized inertia, external and 
constraint forces respectively. In equation (2) fi=0, 1=1,..., c are the 

constraint equations in the acceleration level, A, are undetermined 
multipliers, and yi,...,yn are the generalized speeds of the system chosen 
for convenience as independent linear combinations of q P . The 
tranformation between q P and y P , e.g. Euler angle derivatives and relative 
angular velocity components can be expressed as 

qh = Th P y P h,p=1 , . . . ,n (3) 

where Th P are functions of q P (Kane and Levinson [8]). 

The generalized inertia forces can be expressed in the following form 
(Huston and Passarello [9]), 
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F* = M y + Q 


(4) 


where M Is the generalized mass matrix of the unconstrained system being 
functions of q p , and Q contains the quadratic velocity terms. 

The holonomic and nonholonomic constraint equations can be expressed 
in the acceleration level as below 


Bi P y P = hi i = 1 , . . . ,c 


(5) 


In eq. (5) B is cxn constraint matrix, and Bip and hi are in general 
functions of q P and y P . 

Note that for holonomic constraints (Jh (q P ,t)=0. 


Bip = Thp 

and for velocity level nonholonomic constraints vi (q P ,yp ,t)-0, 


>i P 


ay P 


Then, using eq. (2), the generalized constraint forces are 


F c = B T A 


( 6 ) 


The undetermined multipliers Ai represent the restraining constraint 
forces and moments generated by the constraints at the points of 
application (Ider and Amirouche [11]). 

Equations (1) and (5) represent the governing dynamical equations. 
Combining these and making use of equations (4) and (6), we have 


M 

B T 

y ' 


F-Q ‘ 

. B 

0 . 

A. 


h 


(7) 


The accelerations obtained from eq. (7) are then used for numerical 
integration for the time history of y P and, through the use of eq. (3), 
the generalized coordinates q P . 

Lagrange multipliers could be eliminated to reduce the equations for 
computational efficiency. To this end let C represent a nx(n-c) matrix 
which is orthogonal complement to B, obtained either by Singular value 
decomposition (Singh and Likins [12]), Zero eigenvalue method (Kamman and 
Huston [3]), Q-R decomposition (Kim and Vanderploeg [13], Amirouche and 
Jia [6]), or row equivalence transformation (Ider and Amirouche [10]). 
Premultiplying eq. (1) by C T yields 

C T (F*-F) = 0 (8) 

Combining equations (8) and (5) and utilizing eq. (4), we obtain the 
reduced equations 
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( 9 ) 


' C T M 

. 

' C T (F-Q) ■ 

B 

y r 

h 


Equations (9) and (3) form a set of 2n first order ordinary differential 
equations that can be numerically integrated to obtain the time history of 
y P and qp. 

When relative joint coordinates are selected as the generalized 
coordinates and the corresponding partial velocity vectors are developed 
using recursive multibody kinematics (Huston and Passarello [9], Ider and 
Amirouche [10]), constraint equations for joint connections are 
automatically eliminated. Hence, in this paper an open tree-like system 
represents an unconstrained system where n is the total number ofthe free 
joint degrees of freedom. 


3. CONTROL FORCES FOR KINEMATICAL CONSTRAINTS 


Now consider that a tree-like multobody system is subject to geometrical 
and kinematical constraints. Kinematical constraints represent desired 
motions or desired paths of certain points or bodies. They are the 
conditions that have to be realized by the actuators in the system. The 
desired motions could be specified at position, velocity or acceleration 
levels and could be holonomic or nonholonomic. 

Whether one uses Newton-Euler, Lagrange or Kane’s equations or other 
variations of these, in the conventional approach the constraints in the 
system are modeled by constraint reaction forces which are perpendicular 
to the constraint surfaces. In the case of geometrical constraints 
perpendicular reaction forces at the application points are generated, 
hence the above approach is necessary. On the other hand kinematical 
constraints could be achieved by a number of alternative control forces 
whose directions in the generalized space can be be selected by physical 
considerations. Therefore the conventional equations of motion should be 
generalized by considering general direction control forces for 
kinematical constraints. 

Consider c constraint equations (5), and let ci of the constraints in 
the system be geometrical and the remaining cz (c 2 =c-ci) be kinematical. 
The constraint matrix B and the vector of constraint force magnitudes A 
can be partitioned such that 

B = [B qT B* t ] t (10) 
and 

X = [X° T A kT ] t (11) 

where B G is a cixn matrix, B K is a C 2 matrix, A G is a ci dimensional 
vector and A K is a C 2 dimensional vector. 
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Addition of control forces to the equations of motion yields 

M y + Q + B gT X g + B kT A k + A T (^ = F (12) 

where A is a C 2 xn control force matrix where each row represents the 
direction of the control force for each kinematical constraint in the 
generalized space, and h is C 2 dimensional vector of control force 
magnitudes. Now assume that the control force directions and magnitudes 
are selected such that the restraining constraint forces \ G become zero. 
This leads to 

M y + Q + B gT X G + A T /u = F (13) 

Eq. (13) can be written in the following form 


M y + Q + Z T 1/ = F (14) 


where Z is the augmented matrix of constraint and control force directions 
Z = [B gT A t ] t (15) 

and is the vector of constraint and control force magnitudes, 

1/ = [\° T /* T ] T (16) 

Once the control force directions A are selected by physical 
considerations eq. (14) needs to be solved together with eq. (5) to 
determine the control force magnitudes simultaneously with the generalized 
accelerations. Hence the augmented equations of motion are 


‘ M 

V 


' y ‘ 


'F-Q ' 

. B 

0 . 


y . 


. h 


Alternatively the equations could be reduced in a manner similar to 
Section 2. To this end, let C be a nx(n-c) matrix orthogonal complement to 
Z. Premultiplying eq. (14) by C T and augmenting with eq. (5) leads to 


£ t m 

w — 

' C T (F-Q) ' 

B 

y - 

h 


In the case when the reduced equations are used v can be obtained from eq. 
(14) utilizing the computed accelerations, as 

V = (Z ZT)' 1 Z (F-My-Q) (19) 
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Note that A should be selected such that rank Z-c, because otherwise there 
will be less than c 2 control forces to control c 2 kinematical conditions. 

The augmented equations have a solution if and only if the augmented 
mass matrix is nonsingular, in which case the prescribed conditions are 
achieved with the corresponding control forces. On the other hand if it is 
not physically possible to realize the kinematical constraints with the 
selected control force directions, this reveals itself as a singular ( or 
near singular) augmented mass matrix. Therefore the condition for the 
existence of solution could be expressed as follows: Directions A should 
be chosen such that a linear combination of the rows of C T M should not be 
a linear combination of the rows of B. In other words the vector space 
spanned by the rows of C T M and the vector space spanned by the rows of B 
should be nonintersecting (except the zero vector). Since the dimension of 
the vector space spanned by the rows of B is c, and that of C M is n c, 
the possibilities for C T M are infinitely many (provided that n-c>0). Hence 
one can construct various vector spaces for C T M by different selections of 
the control force directions A. C T M that correspond to directions B is 
only one of them. 

For redundant systems, i.e. when c<n, it has been observed that one 
usually has several physically meaningful control force directions to 
satisfy the given kinematical conditions. 

In the special case when A is selected such that its rows are linear 
combinations of the rows of B K , then since C is the same as C in the 
conventional model, y becomes the same as the conventional case and Z v 
becomes equal to B T X- However i/i may be different than Ai depending on A. 

Similarly for nonredundant systems, i.e. n=c, B is nxn, and rows_of Z 
are necessarily linear combinations of the rows of B. In this case C is 
null matrix and the above procedure reduces to the conventional method 
where Z T v is equal to B T \. 

It should be noted that C T M and B may form non intersecting vector 
spaces even if C T and B do not. Hence realization of the prescribed 
motions is possible even in the extreme case when a control force 
direction is tangent to the corresponding constraint surface. This is due 
to the inertia coupling between the generalized coordinates. 


4. SIMULATIONS OF A REDUNDANT ROBOTIC SYSTEM 


In the three link manipulator shown in Figure 1, the configuration of the 
system can be described by three generalized coordinates 9i , 02, 03. The 
generalized speeds yp are defined as 

yi = 0i , y 2 = ©2 , y3 = 03 

The data used are Li=L2=L3=1.0m, mi=30kg, m2=m3=18kg, Ii=10 kg.m 2 , 
12=13=8 . 64 kg.m 2 . 

The end-effector (point A) is desired to move in the horizontal 
direction with a constant velocity v A . Hence the constraint equations In 
the system are 
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Figure 1. Three link manipulator 


Lici + L2C12 + L3C13 = v A t + 1.9088 

( 20 ) 

Lisi + L 2 S 12 + L3S13 = 0.9893 

where ci=cos0i, ci 2 =cos( 0 i+ 02 ), ci3=cos(0i+02+03 ) , and similarly si=sin0i, 
si 2=sin(0i +02 ) , si 3=sin(0i +02+03 ) . At the acceleration level the 

constraint equations are given by eq. (5) where B and h are 


B = 


Li S1+L2S1 2 + L3S1 3 
Li ci +L2C1 2+L3C1 3 


L2S1 2+L3S1 3 
L2C1 2+L3C1 3 


L3S1 3 
L3C1 3 


( 21 ) 


and 


h = 


’ “Li si yi 2 -L2Si 2 (yi +y2 ) 2 -L 3 Si 3 (yi +y2 + y3 ) 2 
.-LiCiyi 2 -L2Ci2(yi+y2)2-L3Ci3(yi+y2+y3) 2 


( 22 ) 


Initially the system is at the configuration 0i=6O°, 02=-1O°, 
03^-90° . The initial generalized speeds are yi=0.386 rad/sec, y2=0, 
y3=-0. 9618 rad/sec, which correspond to v A = -1 m/sec. 
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First the system is simulated using the conventional method. The 
generalized constraint forces can be expressed from equations (6) and (21) 
as, 


Ft 


Xi (L1S1+L2S12 + L3S13) + X2 (L1C1+L2C12+L3C13 ) ' 

f! 


Xi (L2S1 2+L3S13) + X2 (L2C12+L3C13) 

F 3 


Xi L3 Si 3 + X2 L3CI 3 


In particular we wish to determine joint moments denoted as Mi, i-1,2,3 
that would achieve the desired kinematical conditions. Ft, i = 1 ,2,3 
represent the required joint moments. It is seen from eq. (23) that all 
three joint moments are nonzero, i.e. motors are required at all three 
joints. 



Figure 2. Displacements. 

Conventional method : 1.A0i, 2.A02, 3. A03 
Control force method: 4. A9i , 5.A02, 6. A03 
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Figure 3. Velocities. 

Conventional method : 1. 0i, 2. 02, 3. 03 
Control force method: 4. 0i , 5. 02, 6. 03 


The simulation is performed for 1 sec., until the end effector moves 
1m in -x direction. a0i and Q \ , i=1,2,3 are plotted in Figures 2 and 3 
respectively. The joint moments Mi required to obtain the desired motion 
of the end effector as obtained by the conventional method are shown in 
Figure 4. 

Second the system is resimulated by the proposed control force 
approach. As an illustration the control force directions are selected 
such that no moments are needed at the lower joint of link 1. The 
corresponding control force directions are 


A 



1 0 ' 

0 1 . 


(24) 


Note that since there are no geometrical constraints in this system, the 
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Time (sec) 



Figure 4. Joint moments. 

Conventional method : 1. Hi, 2. M 2 , 3. M 3 
Control force method: 4. M 2 , 5. M 3 


matrix Z is identical to A, and the vector v is identical to p . The 
control forces Z T v are 



0 ' 


0 


0 

Z T V - Vi 

1 

+ V 2 

0 

= 

V 1 


. 0 . 


. 1 j 


. V 2 


(25) 


Hence, in this case, the required joint moments for the prescribed end 
effector motion are Mi=0, M 2 =vi , M3=‘ , 2. 

The augmented mass matrix was observed to be full rank as expected. 
0i and 0i , i = 1 ,2,3 for a simulation of 1 sec. are plotted in Figures 2 and 
3. The joint moments are plotted in Figure 4. Note that in this case a 
motor is not needed at the lower joint of link 1. 
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Initial configuration 


Figure 5. Final configurations: 1. Conventional 
method; 2. Control force method. 


The system’s motion differ in the above two approaches although in 
both cases the end effector performs the same motion. The configurations 
at t=1 sec. corresponding to the conventional model and the control force 
model are shown in Figure 5. 


5. CONCLUSIONS 


This paper presented a general procedure for the dynamic modeling of 
multibody systems subject to kinematical constraints. General direction 
control forces have been introduced that replace the conventional 
constraint reaction forces, hence increasing the ways of realization of 
the prescribed motions. It is shown that the possible conrol force 
directions are more than one, and the criteria for the existence of 
solution have been presented. 

The method proposed in this paper involves selecting the control 
force directions in the generalized space by physical considerations, and 
then solving their magnitudes simultaneously with the corresponding system 
motion. As a result one can design alternative control forces that can be 
applied by the actuators in the system. The method is expected to be 
especially useful to control the extra degrees of freedom in systems that 
have joint flexibility or joint clearance. 
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SYSTEM DESIGN FOR NONLINEAR DYNAMIC MODELS OF AIRCRAFT 
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ABSTRACT 

Control system design for general nonlinear flight dynamic models is con- 
sidered through numerical simulation. The design is accomplished through a 
numerical optimizer coupled with analysis of flight dynamic equations. In the 
analysis, the general flight dynamic equations are numerically integrated and 
dynamic characteristics are then identified from the dynamic response. The 
design variables are determined iteratively by the optimizer to optimize a 
prescribed objective function which is related to desired dynamic character- 
istics. Generality of the method allows nonlinear effects of aerodynamics and 
dynamic coupling to be considered in the design process. To demonstrate the 
method, nonlinear simulation models for an F-5A and an F-16 configurations are 
used to design dampers to satisfy specifications on flying qualities and con- 
trol systems to prevent departure. The results indicate that the present 
method is simple in formulation and effective in satisfying the design objec- 
tives. 

INTRODUCTION 

At high angles of attack, the aerodynamic forces and moments are, in 
general, time-dependent and nonlinear functions of motion variables. There- 
fore, the traditional control system design method based on a linearized 
dynamic system are not appropriate. In addition, the aerodynamic, kinematic, 
and inertial coupling phenomena are important to the high angle-of-attack 
flight dynamics of modern aircraft. As a result, a number of high angle-of- 
attack control concepts have emerged (refs. 1-4). Therefore, a suitable 
control system design method must be capable of incorporating these coupling 
phenomena with considerations of time-dependent, nonlinear aerodynamic forces 
and moments. A control system designed without considering these coupling 
phenomena often has a detrimental effect on the departure/spin resistance (ref. 

5) . Another feature of high-alpha control system is the simultaneous utili- 
zation of several control surfaces or devices. Therefore, a design method 
capable of handling multiple input and output is necessary. A current approach 
to solving this problem is by extensive piloted simulation (ref. 5). 

Methods in optimal control theory represent possible approaches to solving 
these problems under consideration. These methods are derived through calculus 
of variation. However, computational methods in existence require lineari- 
zation of dynamic equations and aerodynamics about trimmed conditions (ref. 

6) . Another alternative is to apply numerical optimization techniques without 
linearization as they are frequently used in structural and aerodynamic designs 
of large systems. A similar approach has also been used in other control 
applications in ref. 7. 
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In the present method, a numerical optimization technique based on conju- 
gate gradients and feasible directions (ref. 8 ) is coupled with an analysis 
method which is to obtain numerical solutions of the nonlinear six degree-of- 
freedom dynamic equations. This analysis method is to provide information 
needed in the design process, such as damping ratios, frequencies, motion vari- 
ables involved in dynamic instabilities, etc. Since the analysis method can 
deal with nonlinearities in the dynamics and the aerodynamics and with any 
general constraints on the control system configuration, the control system 
designed with a numerical optimization technique can be very realistic and 
effective. 


NUMERICAL APPROACHES 

Typically, a control system may be designed to enhance flying qualities, 
to prevent flight departure, and to have an effective maneuver control 
system. To demonstrate the present method, only the first two objectives will 
be considered. That is, one is to design dampers at a moderate angle of attack 
to satisfy specifications on flying qualities and the other to design a control 
system to prevent flight departure at high angles of attack in a maneuver. 
Numerical formulations to solve these problems are described in the following. 


Design to Satisfy Flying Qualities Specifications 

The general system of equations can be written as 

m(u - vr + wq) = mg + F A + F T 

x x 


m(v + ur - wp) = mg + F A + F T 

y y y 


m(w - uq + vp ) = mg + F A + 

z z 

I p - I r - I pq + (I - I )rq = L + L 

xx H xz xz*^ zz yy AT 

I q + (I - I )pr + I (p 2 - r 2 ) = M + M T 

yy H xx zz * xz AT 


- I...P + (I„„ - I_)pq + - n a + N ' 


zz xz 


yy xx 


A T 


J = p + q sin 4 > tan9 + r cos<t> tanG 
9 = q cos$ - r sin 4 > 

4 . = (q sin4> + r cos 4 )sec 0 
a ■ tan ^(w/u) 


sin \v//u^ + v^ + w 2 ) 


(la) 

(lb) 

(lc) 

(l d) 

(le) 

(l f) 

(lg) 

(lh) 

(li) 

1J) 

(lk) 
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where (u, v, w) are the three linear velocity components of the aircraft; (p, 
q, r) are the angular velocity components; and (<)>, 0, (J;) are the Euler angles 
in bank, pitch, and yaw, respectively. The subscripts (x, y, z) appearing on 
the right-hand side of Eqs. (la) - (lc) denote components in the corresponding 
coordinate directions; and (A, T) denote aerodynamic and thrust components, 
respectively, g is the gravitational acceleration, and F’s are the external 
forces, while (L, M, N) are the moments about the (x-, y-, z-) axes. In 
addition, m is the mass and l xx , I xz > etc., are the moments of inertia. The 

aerodynamic forces and moments (F A , L A , M A , N A ) , including the control effects, 

are represented in dimensionless coefficients in a tabulated form as functions 
of motion variables in this study* The motion variables are (u, v, w, p, q, 
r). 


This system of equations is numerically integrated from an initial state 
(usually a trimmed level flight condition) after disturbances (such as 
impulsive control-surface deflections) are imposed to generate time-history 
data of motion variables. 

For demonstrative purposes, it is assumed that dampers to provide flight 
characteristics satisfying flying-qualities specifications are to be deter- 
mined. This problem has been solved in the past by conventional methods, such 
as the root-locus method, by using linearized equations of motion. It is con- 
sidered here mainly to show the generality of the present method even without 
linearizing the equations of motion. In the present method, the necessary 
design information includes damping ratios, natural frequencies, and time con- 
stants of the vehicle motions. These characteristics are identified from 
calculated time-history results of motion variables after multiple-axis distur- 
bances are imposed. The numerical method used for parameter identification is 
the method of differential corrections described in the following. 

A general discretized system output in the time domain is assumed to be of 
the form: 


f(t k > 


n 
Z e 
i*l 


“^n^k 


(A^cosu^ 


+ B sinw. t, ) + 

1 1 k j=l J 


m “Vk 

£ C.e J + 


Dt k + 


( 2 ) 


2 

where t k * At(k - 1), k = 1, 2, . K, u>^ = /I - is the damped frequency 

of the mode. The objective is to use Eq. (2) to fit the dynamic response 
data [Q k = x(t k )] through the method of least squares to determine the damping 



j = 1, m. These parameters appear nonlinearly in Eq. (2). Other un- 

knowns, A^, B^, C j , D, and E, are linear parameters in Eq. (2). Because of 

nonlinearity, finding a solution of the resulting nonlinear algebraic equations 
from the least-square formulation is difficult. The best approach, as it has 
been determined in the present investigation, is the method of differential 
corrections. In other words, the unknown parameters are expressed as 


396 



c i 



+ AC , 


w = w + Au) , etc., i = 1, n, j = 1, m 

n i n i, n i 

k 

where C. > <*) , ... are the initial approximations of the i 1 "* 1 or j 1 "* 1 un- 

i k n i 
K 1 k 

knowns. Using Taylor series expansions, it is obtained that 

Q. + e. - f (t, ) + Z (AC. + Aw + AA + AB ^-) 

k k ok , t i 5C* n, doo i dA^ i SB^ 

1 = 1 I, i n, I. i t 


m 


+ . E . <4C j aF + ror’ - “ fr + 4E Hr + ••• 

J =1 Ju J. k k 


(3) 


where e R is the residual. The least-square method is then applied to Eq. (3) 
in such a way that 

K 2 

G - E = minimum 


k = l 


The differential corrections (AC ^ i etc.) which minimize the G-function are 

determined by setting the first derivatives, 9G/9(AC^) > etc, to zero. Once 

the differential corrections are determined, they are added to the initial 
estimates of the unknowns and the process is repeated to determine a new set of 
differential corrections until G is a minimum or until there is no significant 

change in the unknowns. Typically, convergence is assumed if G < 10 7 . 

After the necessary design information is obtained from the analysis part 
of the algorithm, the optimizer is called to perform the design process. 

The damper design problem here may be formulated as follows: find the 

pitch rate feedback gain K q , the roll rate feedback gain K p , the yaw rate feed- 
back gain K r , the lateral acceleration feedback gain K a y> and the aileron-to- 
rudder interconnect gain K ARI , such that the following objective function is 
minimized : 
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-B 1 , -B2 

r + : — — — rz — 


e + El x 

Tc r l~ 
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D i 
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e + E5 x [T - T I e + E6 x |T - T 
1 r. r 1 1 s, s 
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where £ 


, C , ^ _ > (jl) , T , and T are specified values to satisfy MIL- 
8p i p i D i n Dl r i s i 

F-8785B. £ y C y w , T , and T are values obtained in the analysis 

sp p D n^ t s 

part* In Eq. (4), Bi and Ei are some weighting factors, e in the denominator 
of the objective function is a small number used to prevent the objective func- 
tion from being infinite and is set to 10~^ in the present algorithm. The 
optimization problem is subject to constraints on magnitudes of damping ratios, 
frequencies, time constants, overshoot, etc. In the optimization process, the 
design variables are varied systematically by the optimizer to obtain numeri- 
cally the gradients of the objective function and constraints. These gradients 
are then used through the methods of conjugate gradients and feasible direc- 
tions to determine the appropriate design variables to minimize the objective 
function. The process continues until the objective function does not change 
and the constraints are all satisfied. 


Design to Prevent Flight Departure 

Again, Eqs. (1) are numerically integrated. During time integration, a 
certain maneuver flight is imposed to induce departure of the airplane. One 
example of the maneuver flight is to pull up the airplane (i.e., to increase 
the angle of attack) and then induce a high roll rate afterwards. The present 
algorithm is constructed on the assumption that a departure condition is 
identifiable from the magnitude of the state vector, or motion variables. 

Since the latter are directly obtained from time integration of Eqs. (1), no 
further data manipulation is needed to calculate the necessary design infor- 
mation. 

The design objective is achieved by first assuming a control system struc- 
ture. Then the design problem may be formulated for the demonstration cases to 
be presented as follows. 

Determine the aileron-rudder interconnect gain (K^j), the side acceler- 
ation feedback gain (K ), and the yaw damper gain (K ), etc., to 

ay r 

minimize the following objective function: 


OBJ * - 


^l^max ^2 a trim 


max 

C, 


3 4 

I + E I'Knoj + E 


max' 

C 8 


7(3 | + e 

1 *max 1 


r + e 
1 max' 




(5) 


subject to various constraints depending on applications. Note that Eq. (5) 
indicates that p (the roll rate) is to be maximized and a_„„ in the transient 

motion, (yaw angle), p max (sideslip) and r ffiax (yaw rate) are to be mini- 

mized. a trim ca l cu l ate d as t ^ le average angle of attack over the whole time 

period and may be used to define the limiting angle of attack to be discussed 
later for the F-16. Specific applications are discussed in the next section. 
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Two fighter configurations will be used to demonstrate the algorithm, one 
being an F-5A and the other an F-16. A pitch damper design for the F-5A will 
be considered first. Control to prevent flight departure will be discussed 
next. For illustrative purposes, all system gains in the following consider- 
ation are assumed constant. 


A Pitch Damper Design for an F— 5A 


The algorithm has been tested and found to work well at different flight 
conditions to design dampers under multiple input conditions. At low angles of 
attack, calculated results are found to be consistent with existing systems. 

To demonstrate this computational tool, consider designing a pitch damper at a 
Mach number of 0.3 and at an altitude of 10,000 ft. The corresponding a trim is 


determined to be 11.7 deg. 


Assume that a damping ratio of 0.65 


«. Pl > is re " 


quired in the longitudinal dynamic response of the short-period mode. The 
optimization problem may then be formulated as follows: 


Determine the pitch damper gain constant (K ) to 

minimize the difference in the actual (C ) and desired 
sp 


ratios; and 

subject to the constraints that 


(C 8p ) damping 


0 < a) <10 rad/sec. 
n 

sp 

Limitations on the control system are that 

the pitch rate feedback be limited to 4 deg/sec, 

the elevator deflection limits are +5*5 deg to -17 deg., and 

the elevator actuator rate limits are -26 to +26 deg/sec. 


Fig. 1 shows that the pitch damper gain constant to satisfy this design problem 
is 4.36. The existing system with * 0.2 is not adequate to provide a 

damping ratio of 0.65. Note that during the design process, motions along all 
axes have been imposed to provide any possible effect of inertial coupling. 

Control System to Prevent Yaw Divergence of an F-5A 

The second example is to design a control system to prevent yaw divergence 
of an F-5A during roll maneuver at high angles of attack. The aircraft is 
placed in a departure condition by a maximum constant elevator deflection to 
increase the angle of attack, followed by a constant roll control deflection of 
2 deg. The optimal control problem is formulated as follows. 

Determine the aileron-rudder interconnect gain the side acceler- 

ation feedback gain (K ), and the yaw damper gain (K ), to 

ay r 

maximize the roll rate, and 

minimize the sideslip angle (p), the yaw rate (r), and the change in 
heading angle ((J>). 
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In other words, in Eq. (5) the terms associated with a,., and la I are not 

trim 1 max 1 

used. Some results for the time variation of motion variables are shown in 
Fig. 2. It is seen that the present method is effective in satisfying the 
design objective by reducing both the change in yaw and bank angles. 

Control System to Prevent Pitch Departure of an F-16 

The aerodynamic data are obtained from ref. 9. Since the F-16 is unstable 
in pitch, design of a pitch control system is of major concern. The system 
includes an angle-of -attack and normal-acceleration feedback control. The 
airplane is first pulled up by applying the full stabilator deflection command 
(-25°). The objective is to minimize Eq. (5) with 

C l = 0.01, C 2 = 0.03, C 3 - 1, C 4 - 12, C 5 = 0.01 

= 0.008, C = 1 , C = 1 
o / o 

These weighting factors are chosen so that various terms in Eq. (5) have the 
same order of magnitude. The design variables are the various gain con- 
stants. Note that the a-feedback system is defined such that 

6 due to a-feedback « K a - K (6) 

e a c 

Two flight conditions are examined, one without imposing a roll maneuver 
after pull-up and the other with a roll maneuver. Results for the first case 
are presented in Fig. 3. It is seen that if there is no angle-of-attack 
limiting system (K^ =0, * 0 ) , the airplane will trim at an angle of attack 

equal to about 66 deg, which is the deep-stall condition. On the other hand, 
the limiting system would limit the trim angle of attack to about 25 deg. 

For the second case, a roll control of -10 deg is applied between t = 22 
and 34 sec. Note that roll should induce pitch-up due to inertial coupling. 

The results shown in Fig. 4 indicate that no departure has occurred and oc tr ^ m 

is determined to be 24.7 deg. By changing the initial time at which the roll 
control is applied, « tr £ m can still be determined to be about 25 deg. There- 
fore, it may be concluded that maximizing ct tr ^ m is to define approximately the 
limiting angle of attack. 

CONCLUSIONS 

Application of numerical optimization techniques to control system design 
was demonstrated for F-5A and F-16 configurations at high angles of attack. 

The methodology accounted for nonlinearities in aerodynamics and dynamics. 
Specific examples were presented to design control systems to satisfy flying 
qualities requirements and to prevent flight departure. The results indicated 
that the present method was effective in satisfying design objectives. 
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AWCLE OF ATTACK (a) , deg 



TIME t (sac) 

Figure 1 Time History of the Angle of Attack Responding 
to Impulsive Elevator, Aileron and Rudder 
Deflections for an F-5A with Pitch Damper 
(M-0.3, h=10,000 ft.). 
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Figure 3 Control System Design to Prevent Pitch Departure of an 
F-16 Configuration in a Pull-up Maneuver without Roll 
at M = 0.5 and h = 30,000 ft. 




Pitch Angle (0), deg Angle of Attack (a), deg 



Yaw Angle (rp) , deg Bank Angle (<t>) , deg Sideslip Angle (3) , deg 





Figure 4 Concluded. 
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Abstract 

A new class of robotic arm consists of a periodic sequence of truss substructures, each 
of which has several variable-length members. Such variable-geometry-truss manipu- 
lators (VGTMs) are inherently highly redundant and promise a significant increase in 
dexterity over conventional anthropomorphic manipulators. This dexterity may be ex- 
ploited for both obstacle avoidance and controlled deployment in complex workspaces. 
The inverse kinematics problem for such unorthodox manipulators, however, becomes 
complex because of the large number of degrees of freedom, and conventional solu- 
tions to the inverse kinematics problem become inefficient because of the high degree 
of redundancy. This paper presents a solution to this problem based on a spline-like 
reference curve for the manipulator’s shape. Such an approach has a number of advan- 
tages: (a) direct, intuitive manipulation of shape; (b) reduced calculation time; and 
(c) direct control over the effective degree of redundancy of the manipulator. Further- 
more, although the algorithm has been developed primarily for variable-geometry-truss 
manipulators, it is general enough for application to a number of manipulator designs. 


1 Introduction 

A new class of robotic arm consists of a periodic sequence of truss substructures containing variable- 
length members. Variable- geometry-truss manipulators (VGTMs) have emerged from various de- 
signs for deployable /control! able trusses for space-based applications [10,12,16]. An example of 
a VGTM is illustrated in Figure 1. The manipulator is a statically determinate truss comprised 
of linear members hinged together at joints. Some members axe actuated and can change their 
length; control of these lengths determines the manipulator’s shape as well as the position and 
orientation of the end effector. 

Truss manipulators promise a number of advantages over conventional anthropomorphic 
robot arms and space cranes, not the least of which is a significant increase in dexterity. This 
increased dexterity, however, has a price: a complex and unorthodox kinematic description. This 
paper will outline a kinematic methodology for VGTMs and set up a solution to the general 
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Fixed 

Member 



Figure 1: Example of a VGTM 


inverse kinematics problem. Furthermore, a new inverse kinematics algorithm — based on spline- 
like reference shape curves — will be presented and compared with the more conventional solution. 


2 Background to the Inverse Kinematics Problem 

The purpose of any robotic arm is to manipulate an end effector along a desired trajectory; this 
motion depends on the combined action of the manipulator’s actuators. The task of determining 
the correct actuator motion for a given end effector trajectory is the inverse kinematics problem. 

2.1 Direct Kinematics 

The end effector is defined by a set of n e coordinates, x«(<). This vector may contain up to six 
elements: three position and three orientation coordinates. The manipulator’s configuration is 
defined by the vector q(f) containing n q elements representing the manipulator’s internal degrees 
of freedom. In general, we may state the direct kinematics relationship in the form 

x* = f(q) (1) 

n e ^ n q (2) 

The inequality represents the possibility of redundancy . 

2.2 Differential Kinematics 

In general, the direct kinematics relationship (1) cannot be inverted directly to give the inverse 
kinematics solution. Instead, we resort to differential kinematics [1,15], whereby we arrive at a 
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Figure 2: Geometric Representation of a VGTM 

rate-linear system by expanding the time derivative of x e (t): 

x e (t) = J(q)q(t) 



The Jacobian matrix J has dimensions n c X n q . 


( 3 ) 

( 4 ) 


2.3 Inverse Kinematics Solution with Redundancy 

For a redundant manipulator, the Jacobian is not square and a solution to (3) is not straightfor- 
ward. A common formulation of a solution takes the form [6,7] 

q = J+x e + (1 - J+ J)q, (5) 

JJ + = 1 (6) 

The Moore-Penrose pseudoinverse [13], J + , represents a least-squares solution to (3); alternative 
solutions have used weighted pseudoinverses [5]. The vector q* is an arbitrary configuration velocity 
that may be used to minimize a cost function, avoid obstacles or fulfil some other objective [3,8,11]. 
The operator (1 — J + J) projects this velocity onto the null space of J so that the desired motion 
of the end effector is not affected. 


3 Kinematic Description of VGTMs 

Generalized techniques exist to develop kinematic descriptions for conventional anthropomorphic 
manipulators [2]. These techniques, however, are not well suited to truss manipulators. In this 
section, a general kinematic methodology for VGTMs will be developed. 
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3.1 VGTM Geometry 


Figure 2 shows a geometric representation of a truss manipulator. The linear members of the truss 
(actuated and nonactuated) are represented by straight lines having lengths Ik- Embedded in the 
joint mechanisms are the member endpoints , represented by the position vectors r,^. Associated 
with each joint — or truss node — is a node coordinate vector , p t ; each node vector represents a 
fixed point within the joint mechanism that defines the joint’s position in space. Since the truss 
structures are assumed to be statically determinate, we may calculate the orientation of each joint 
given the set of node vectors [p x , p 2 i • # ■ , Pm], where N is the number of joints in the truss. 

3.2 Configuration Variables 

For conventional robotic arms, the configuration variables are identical to the actuator variables, 
i.e., revolute joint angles and prismatic joint lengths. For VGTMs, however, the actuator variables 
— the nonfixed member lengths — are unsuitable when writing the direct kinematics; an inter- 
mediate set of variables must be used. The most appropriate choice for configuration variables 
are the node coordinate vectors p±, (i = 1,2, ... , N ). These variables have the following properties 
that make them suitable: 

1. Relationships may be found between the node vectors and useful external parameters. For 
instance, we may be interested in the position and orientation of a particular triangular face 
of the truss (where an instrument platform may be attached); these coordinates may be 
found if the node vectors describing the vertices of the triangle are known. 

2. Since the truss is statically determinate, we may calculate the endpoint vectors r t jt given 
p,, (f = 1,2,**-, N); once the endpoints are known, we may determine the length of each 
actuator. Hence, the actuator coordinates for the truss may be arrived at through this set 
of configuration variables. 

3.3 Kinematic Relationships 

There are three kinematic relationships by which we may determine the node coordinate vectors: 

Explicit External Constraints. This kinematic relationship involves node vectors that are 
known, explicit functions of time; these functions are defined independently of the truss configu- 
ration and hence are ‘external.’ If there are N c such constrained nodes, we may group them into 
the vector p c such that 

Pc = c°l[Pi(0] (7) 

t = 1,2 (8) 

Typically, these form the base nodes of a VGTM. For completeness, we also group the uncon- 
strained node vectors into p : 


( 9 ) 

( 10 ) 
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p = col[p t ] 

i = N c + 1, N C + 2,...,N 


Implicit External Constraints. This type of constraint describes the relationship of truss 
nodes to a set of external coordinates — such as the end effector coordinates x e and takes the 
form 

f(P>Pc) = x e(0 ( U ) 

where f (p, p c ) is a set of functions in the node coordinate vectors. 

Internal Constraints. The n, nonactuated members of a truss manipulator serve as hard con- 
straints to the motion of the nodes. These internal constraints take the implicit form 


{*ik — r jk) T { *xk T jk) — ■ ^ k 

(12) 

r ik — k(PyPc) 

(13) 

k = 1 J 2, . . . y Til 

(14) 

i,j — 1, 2, . • . , N 

(15) 


The length of fixed member k is expressed in terms of its endpoint vectors, r ik . If the truss nodes 
are perfect pin joints, the endpoint vectors become identical to the node vectors. 


3.4 Inverse Kinematics 

A unique solution for p may be impossible to determine because of redundancy. To resolve this 
redundancy, we first find rate-linear expressions based on the constraint equations. From the 
implicit external constraints (11), we form 

J u (p,p c )p + J c (p,p c )p c = x e (16) 


where 


Ju 

Jc 




(17) 

(18) 


From the internal constraints we get a set of n/ equations, ( k — 1,2, — , n i), that take the form 


( r t'fc — r jk)T l(J»k,t» — J)'fc,u)p T (J«'A:,e ^jk,c)Pc\ ® 


(19) 


J»jk,c 


fcik 

dp T 



( 20 ) 

( 21 ) 


(N.B. If the nodes are ideal pin joints, the nonzero portions of these Jacobians reduce to identity 
matrices, and we are left with a much simpler set of expressions [18]). We may group these n f 
equations into a system: 

J fl (p, Pc)'P = x s(p> Pc) ( 22 ) 
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Figure 3: VGTMs Approach Continuum 

where 

XgiP, Pc) = COl[— (r,/fe - Tjk) T {3 ik,c - 3jk,c)p c ]k 
^ g(Pt Pc) = c °l[(r,‘Jfc fjk) (J«fc,u Jjfc.u)]*: 

Combining (16) and (22), and defining 

x e - J c 'pc 

*g 


we arrive at the following system: 

J (P,P C )P = * 

The inverse kinematic solution of (27) takes the conventional form 

p = J+i + (1 - J+J)p, 



(23) 

(24) 

(25) 

(26) 

(27) 

(28) 


4 Inverse Kinematics Using Reference Shape Curves 

In the previous section, a solution to the inverse kinematics problem for VGTMs was found using 
the conventional methods outlined in §2* An alternative solution will now be presented, and a 
comparison will be made to the previous solution to illustrate some advantages associated with 
the new technique. 
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Figure 4: Space curves (Bezier) 


4.1 Basic Concept 

A distinctive feature of the VGTM design is its ability to assume curvilinear shapes [9]. Further- 
more, as illustrated in figure 3, the shape of a truss manipulator may be made to conform exactly 
to an arbitrary space curve in the limit as the number of bays gets large. It is this serpentine 
property of VGTMs that has inspired an inverse kinematics technique based on the modelling of 
the manipulator’s shape using reference shape curves. 

The proposed solution algorithm is as follows: 

1. When solving the inverse kinematics problem, replate the manipulator by a continuous space 
curve that satisfies the same external constraints. 

2. Force the manipulator to track this new desired shape. 

The reference shape curve may be thought of as the next dimensional extension of the end 
effector coordinates, x e : while the latter is defined at a point, the reference curve is a continuous 
coordinate distribution over a line. The second-order case will be a reference shape surface that 
may be used to model plate-like variable geometry trusses. 

The primary reasons for pursuing such a technique are 

1. A simplified, analytic expression for the manipulator’s shape will help in modelling interac- 
tions with the robot’s environment (e.g., collision detection, obstacle avoidance, controlled 
deployment). 

2. Less complex kinematic relations, resulting from the simplified shape model, will improve 
the computational efficiency of the inverse kinematics solution. 
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Figure 5: Coordinate Distribution Associated with Space Curve 

4.2 Type of Curve 

A certain class of space curve, found in computer graphics applications [17], has a number of 
properties that are desirable for this application. These spline-like curves parameterized by s 
(0 < s < 1) — have the form 

r(s,t) = 5>«Ptf) ( 29 ) 

t=0 

1 = (30) 

«=o 

(31) 

Example of such curves are Bezier curves, B-splines and Beta-splines. Because the weighting 
functions Wj(s) are a partition of unity, a point on the curve r is a weighted average of the 
control vertices p Joining the control verticies sequentially forms an open control polygon that 
approximates the actual shape of the curve; closeness of fit between the control polygon and the 
curve depends on the form of the weighting functions. Figure 4 illustrates this relationship for a 
Bezier curve. 


4.3 Coordinate Distribution 

Associated with the curve is a distribution of position and orientation coordinates, x(s,p): 


x(s,p) 


K«>P) 

0(s»p) . 

col [p*] 


P 
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(32) 

(33) 



Figure 6: Partitioning of VGTM 


Figure 5 shows that x(s,p) describes a frame of reference that ha s two properties: 

1, Its origin intersects the curve at r(s,p). 

2. One of its axes is tangent to the curve at the origin. 

Since the tangent is a function of p, two of the three attitude coordinates axe explicit functions 
of the control vertices. The third attitude coordinate — representing a torsion about the tangent 
— may be specified independently of the control vertices, and hence depends on the parameter s 
alone. 


4*4 Kinematics of a Curve 

The relationship between the control verticies and the overall curve shape allows direct, intuitive 
manipulation of the curve. Hence, the control vertices p will be defined as the configuraton 
variables for the curve. The direct kinematics relation follows from the fact that the coordinate 
distribution, x, is an extension of the original end effector coordinates, i.e., 

Xe(t) = x(l,0 (34) 

The rate-linear kinematic expression for any point on the curve is 

J 4 (a,p)p = x(s,0 (35) 


where J 5 — the curve Jacobian — is defined as 



Figure 7: Tracking the Curve 


To impose the implicit external constraint expressed in (34), we set 

J»(l,P)P = *e(0 ( 37 ) 

Following (5), the solution for the curve kinematics becomes 

p = J+x e + (l-J+J,)p. (38) 

As was the case for the general manipulator in §2 and for a VGTM in §3, the vector p, may be 
formulated to achieve a variety of objectives including obstacle avoidance. 

4.5 Tracking the Reference Curve 

The second step in the algorithm is to force the manipulator to track the reference curve. To 
do this, the manipulator is partitioned into smaller VGTMs (see figure 6); each segment may be 
described kinematically in the same manner as the whole VGTM. The implicit external constraints 
Xj(t) corresponding to each segment are then read off the reference curve at discrete points — 
[so,^i,...,s^] — called curve nodes, i.e. 

x»(f) = x(s,',p) (39) 

The VGTM segments may be solved in one of two ways: 

1. Recursively, whereby the implicit constraints for the fcth bay are given by x(s*,p), and the 
explicit constraints are provided by the ( k — l)st bay, or, 

2. In parallel, whereby each bay is solve independantly using only the implicit information 
provided by the curve. 

Figure 7 shows a manipulator segment tracking a curve that is executing a maneuver. 




Figure 8: Example Maneuver 


4.6 Example Maneuver 

Figure 8 shows sample frames from a 50-step maneuver involving a five-bay VGTM. The arm is 
directed to avoid two cube-shaped obstacles. The inverse kinematics has been solved using the 
reference curve technique; clearly, the technique works adequately in that the desired end effector 
trajectory is tracked, and a safe clearance is maintained between the robot arm and the obstacles. 

4.7 Advantages to the Reference Curve Technique 

Some of the advantages of the reference curve technique are summarized below: 

Improved Computational Efficiency To evaluate any improvement in computation time, 
a test trajectory was designed and both techniques — conventional and reference shape curve 
(recursive) — were used to solve the inverse kinematics for VGTMs of different lengths. The 
maneuver was partitioned into 50 time-steps, and the runs were carried out on an Apollo™ 
DN 4000 workstation. Figure 9 shows a comparison of the run-times for both methods; clearly, 
there is a marked improvement in computational efficiency when using the reference shape curve 
technique. This improvement may be attributed to the recursive nature of the tracking problem, 
as discussed in §4.5. 

Parallel Structure of Problem As stated earlier, the tracking problem may be solved in 
parallel as well. This natural parallel structure makes the reference curve technique conducive to 
a multiprocessing environment [14]; in such an architecture, dedicated processors may be used to 
solve the inverse kinematics of the VGTM segments concurrently, thereby providing a quantum 
improvement in efficiency. 
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Figure 9: Run-time Comparison 


Analytical Description of Manipulator Shape An analytical expression for the manipu- 
lator’s shape is helpful when describing robot /environment interactions. An important part of 
many obstacle avoidance routines involves finding the closest point to the obstacle on the manip- 
ulator [4,5,11]. Using the shape curve equation and quickly- converging iterative solvers, we may 
readily solve for these critical points. An analytical description may also be incorporated into a 
mathematical model of the manipulator’s workspace to predict collisions. 

Variable Degree of Redundancy Since the global inverse kinematics problem has been shifted 
from the manipulator to the reference curve, we may specify the degree of redundancy a priori 
by choosing the number of control vertices used to describe the space curve. For instance, if it 
is known that the VGTM is to operate in a very simple, obstacle-free workspace, then only a 
minimum number of degrees of freedom — enough to satisfy the external constraints need be 
included; conversely, a complex workspace will demand the use of more control vertices. There is 
a limit, of course: it will be impossible for the manipulator to track a shape curve that has more 
degrees of freedom than itself. Yet, this flexiblity in choosing the degree of redundancy may be 
used to reduce computational overhead. 


4.8 Generality 

While developed for application to truss manipulators, the reference curve method is general 
enough to find use as a redundancy resolution technique with more conventional robotic arms. 
The first step in the algorithm remains the same since the kinematics of the reference curve is 
independant of the manipulator. The second step — the tracking problem — may be applied to 
any robotic arm that can be easily segmented into smaller manipulators. 
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5 Conclusions 


A generalized kinematic description has been presented for variable-geometry-truss manipulators. 
From this description, a solution to the inverse kinematics problem was found by conventional 
means. An alternative technique — based on reference shape curves — was developed and applied 
to VGTMs in the hopes of improving the efficiency of the straightforward approach, as well to 
inject some geometric insight in the description of the manipulator’s shape. The new technique 
succeeded in solving the inverse kinematics problem with a significantly decreased computation 
time. Further advantages were noted: 

• Technique is attractive to obstacle avoidance and collision detection applications; 

• Problem structure is applicable to parallel processing; 

• Variable degree of redundancy 

• Generality 

The preliminary success of this new method is encouraging for the development of real-time-control 
robotic facilities based on truss manipulators. 
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ABSTRACT 

The prescribed tasks in high speed robotic systems are severely 
deteriorated because of their manipulator dynamic deflections. On the 
other hand conventional dynamic modeling techniques fail to reveal 
appopriate control forces in flexible systems. In this paper the 
conventional dynamic equations of motion for systems subject to 
kinematical constraints are modified by a new concept of control force 
representation. The directions of the control forces are selected such 
that they correspond to the joint degrees of freedom. Then the joint 
control forces and torques that yield unperturbed prescribed motions are 
solved simultaneously with the system motion. A flexible manipulator is 
presented to illustrate the methods proposed. 


1. INTRODUCTION 


The operation of high speed robots is severely limited by their 
manipulator dynamic deflection. The vibrations deteriorate the accuracies 
of the prescribed tasks assigned to certain points and significantly 
reduce the robot arm production rate. Hence determination of the 
appropriate control forces and torques at the joints that yield stable 
prescribed motions is an important control problem. 

In this paper geometrical constraints represent geometrical 
restrictions such as closed loops and physical guides. On the other hand 
kinematical constraints represent prescribed desired paths or prescribed 
motions of certain points or bodies. Such prescribed motions are to be 
realized by control forces applied by the actuators in the system which 
are usually placed at the joints. 

In the conventional approach, constraints in the system are modeled 
by constraint reaction forces whose directions are perpendicular to the 
constraint surfaces. (See Yoo and Haug [1], Shabana [2], Kamman and Huston 
[3], Hemami and Weimer [4], Nikravesh [5].) Using conventional methods, 
when the prescribed motions are treated as constraint equations and 
embedded into the governing equations of motion, the corresponding 
generalized constraint reaction forces can be determined. However these 
forces cannot be utilized as physically possible control forces due to the 
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presence of components that correspond to the elastic coordinates. For 
this reason previous solution procedures involved feedback and adaptive 
control algorithms which in turn increase the complexity of the problem 
considerabl y . 

In this paper the kinematical constraints are modeled by general 
direction control forces. The modified equations of motion for flexible 
multibody and robotic systems subject to geometrical and kinematical cons- 
traints are developed. The directions of the control forces are selected 
such that they correspond to the joint degrees of freedom. By this way 
joint control forces and torques that achieve the unperturbed prescribed 
motions are solved simultaneously with the corresponding system motion. 
The modeling of flexible multibody systems in joint space based on finite 
element method and component mode synthesis, as developed in references 
Ider [6], Ider and Amirouche [7] is also outlined. In the equations of 
motion all nonlinear interactions between the rigid body and elastic 
coordinates are automatically incorporated. 

This paper is divided into seven sections. The first section provided 
an introduction. In the second section kinematics and constraint equations 
in flexible multibody systems are outlined. The conventional equations of 
motion for constrained systems are presented in the third section. In 
section four the problems with the conventional approach are discussed. In 
the fifth section the modified equations of motion with general direction 
control forces are developed. Sixth section presents the simulations of a 
flexible manipulator by the proposed method. Conclusions form the last 
section. 


2. FLEXIBLE MULTIBODY KINEMATICS AND CONSTRAINT EQUATIONS 


In a multibody system each joint connection can be described by a total of 
six degrees of freedom. The constrained joint coordinates are eliminated 
In the analysis, hence all possible joint types are allowed. The system 
may contain closed loops and any selected points may have prescribed 
motions. First the recursive dynamical equations are developed for a tree 
configuration which is obtained by cutting the closed loops open (using 
any arbitrary joint in the loop). Closed loops and prescribed motions are 
then Imposed as a set of constraint equations. 

In Figure 1, a typical deformable body Bk and its lower connecting 
body Bj are shown. The joint between Bk and Bj is the lower joint of Bk 
and Is defined by points Qk and Qk and axis frames n k and n k * fixed at 
these points. The elastic deformations are modeled by finite element 
method with respect to a body reference axis frame denoted by N k . N k , in 
general, is not fixed to a point on the body. It follows the rigid body 
motion of Bk in a manner consistent with the specified boundary 
conditions [6,7]. 

The position of the system can be described by the relative joint 
coordinates of each body and the modal coordinates of the flexible bodies. 
Translation of n k with respect to n k * is denoted by vector z k . For the 
relative rigid body rotation degrees of freedom, successive Euler angles 
in transforming n k to n k * can be used. The modal coordinates nj. j=1,..,m k 
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Figure 1. A multibody system 


of each body Bk represent the normal modes of deformation obtained by 
component mode synthesis and m k is the number of the modes considered. 

Let vector w k represent the angular velocity of n k with respect to 
n k *. Then the generalized speeds of the system can be conveniently 
selected as the relative angular velocity components, the relative 
translational velocity components and the modal coordinate derivatives. 
The vector of the system generalized speeds y can be defined as 


where 


y - [w T , zT,»n T 


w = [w! , wi, wj wf, w?, W?F 

i = [z{ , z[, z \ , . . . , z?, zl, z^] T 


and 


n - i ,< i] ’ • * • » » • • • > » • • • » nJJ N 


(D 

( 2 ) 

( 3 ) 

( 4 ) 


For the dynamical equations we need the velocity, in fixed frame R, 
of an arbitrary point P in finite element i of body Bk. To this end, the 
angular velocity of N k in R, w k , is obtained by summing successive 
relative angular velocites and can be compactly expressed as 
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w k = v* w + p k n 


(5) 


where i/ k and p k are the partial angular^velocity matrices composed of the 
coefff icients of the generalized speeds w and t\, respectively. 

It can be shown [6,7] that the velocity of point P in R could be 
written as 


v ki = a ki w + b k z + c ki n (6) 


where a k1 , b k and c M are the partial velocity arrays associated with w, z 
and n, and are functions of displacements only. 

Depending on each joint type, the constrained joint coordinates are 
then eliminated in the generalized speed vectors w and z (equations (2), 
(3)). When the corresponding columns of the partial velocity matrices are 
eliminated, v k and a ki are 3xm matrices, and b k is a 3xn2 matrix, where 
ni is the total number of the free joint rotation degrees of freedom and 
n 2 is the total number of the free joint translation degrees of freedom. 
The remaining arrays i* k and c M are 3xm (m=m 1 +. . ,+m N ). 

Let the tree structure have n degrees of freedom (n=ni + n 2 + m), and 
let the total number of closed loop and prescribed motion type of 
constraints be c. Then the system’s degrees of freedom reduce to n-c. 

The constraint equations can be generated using the partial velocity 
matrices. For example, if a point say A in Br has a prescribed motion, and 
the prescribed velocity vector of that point is given by g(t) with respect 
to R, then denoting the local undeformed vector from Or to A by $ r .the 
resulting three constraint equations are 


a r1 w + b r z + c ri q = g 


(7) 


where a M and c r1 correspond to s r . 

Similarly if the refence axis frame of Br has a prescribed angular 
velocity h(t), we have 

v r w + p r q = h (8) 

For a closed loop type of constraint, let Br and B« connect with each 
other to form a closed loop in 3-D. Differentiation of the position vector 
equation expressing loop closure leads to the three velocity level 
constraint equations, 


( a M_ a s<) w + (b r -b a ) z + (c r ’-c ai )n = 0 


(9) 


The holonomlc and nonholonomic constraint equations can be compactly 
written as 
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B y = g 


( 10 ) 


where B is a cxn constraint matrix and g contains prescribed velocities. 


3. CONSTRAINT REACTION FORCES AND EQUATIONS OF NOTION 


Kane’s equations for the constrained system can be written as 

F + F* + S+F c = 0 (11) 

where F, F*, S and F c are respectively the vectors of generalized 
external, inertia, stiffness and constraint forces. 

The generalized inertia forces F* can be written in the following 

form, 


F* : H y + Q (12) 

where individual submatrices of M and Q can be expressed in terms of the 
partial velocity matrices [7] as, 


M 


= zzf 

k 1 Vk 


a ki T a ki 


sym. 



a k ’ T b k 

b kT b k 


p dV 

(13) 

a kiT c ki 

b kT c M 

c ki T c ki 




and 


-zzf 

- Vm 


k 1 


a k ’ T (a ki 

A 

w 

+ 

b k 

i 

+ 

c ki ii ) ‘ 

a kiT (a ki 

A 

w 

+ 

b k 

z 

+ 

c ki n ) 

a kiT (a ki 

A 

w 

+ 

b k 

2 

+ 

C k <q ) . 


/odV 


(14) 


The stiffness vector S is obtained from the structural and 
geometrical stiffness matrices of each body expressed in modal 
coordinates. 

The generalized constraint forces F c can be expressed as 


F c = B T X 


(15) 


where X is the vector of undetermined multipliers. Since the rows of B are 
the partial velocity vectors, Xi , 1 = 1 ,..., c represent the constraint 
reaction forces generated at the application of the constraints. A row of 
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B can also be viewed as the direction of that constraint in the 
generalized space. 

Substitution of equations (12) and (15) into eq. (11) leads to 

My + S + Q+B T A-F 06) 

Our purpose is to find the accelerations for numerical integration. 
To this end the constraint equations in the acceleration level are 

B y r g ~ B y (17) 

Equations (16) and (17) constitute n+c equations from which the 

accelerations and the undetermined multipliers can be obtained. 

The multipliers could be eliminated for computational efficiency. To 
this end, let C denote a nx(n-c) matrix which is orthogonal complement to 
B [8]. Premultiplying eq. (16) by C T , and combining the resulting equation 
with eq. (17), we obtain the augmented equations for the constrained 
system as below. 


C T M ' 


’ C T (F-S-Q) ' 

B 

y = 

g-By 


4. PROBLEMS WITH THE CONVENTIONAL APPROACH 


In the conventional approach the constraints in the system are modeled by 
constraint reaction forces which are perpendicular to constraint surfaces. 
They represent the reactions of the environment. However kinematical cons- 
traints represent desired motions and are meant to be realized by internal 
control forces. Kinematical constraints are particularly important in 
robotic systems where certain points are assigned specific tasks that 
should be realized by joint actuators. 

Let ci of the constraints in the system be geometric and the 
remaining C 2 (c 2 =c-ci ) be kinematical. The matrix of the constraint force 
directions B and the vector of constraint force magnitudes A can be 


partitioned such 

that 


B = [B qT 

B* r ] T 

(19) 

and 



A = tA GT 

A Kt ] 

(20) 


where the dimensions of B G , B K , A G and A K are cixn, C2xn, ci and C2 
respectively. 

Then eq. (16) can be written in the following form 
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My + Q+ S+F G + F K = F 


( 21 ) 


where the generalized constraint forces F G and F K corresponding 
respectively to geometrical and kinematical constraints are 

F G = B« t X g (22) 


and 


F K = B kT X k (23) 

The constrained system as given by eq. (18) could be simulated to 
determine the generalized constraint forces. In view of eq. (21), then 
control forces numerically equal to F K would yield the same motion of the 
system ensuring the realization of the desired motions. 

However, in flexible systems F K contains components that correspond 
to the elastic coordinates in addition to the components that correspond 
to the joint coordinates. While the latter can be applied by the joint 
actuators as control forces, the former cannot be produced by a physical 
means as control forces. That F K has components in the direction of the 
elastic coordinates is apparent from eqs. (7) and (8) where the 
coefficients of the generalized speeds form the vectors of B K in eq. (23). 
Hence, with the conventional approach it is not possible to design a set 
of control forces that can achieve unperturbed prescribed motions in 
flexible robotic and multibody systems. 


5. CONTROL FORCES FOR KINEMATICAL CONSTRAINTS AND MODIFIED EQUATION OF 
MOTION 

Since kinematical constraints are to be realized by control forces in the 
system, general direction control forces are introduced to the equations 
of motion, so that 

My + Q+ S + B G X qT + B kT X k + AV - F (24) 

where A is a C 2 xn matrix of control force directions and /n is a C 2 
dimensional vector of control force magnitudes. Let the control force 
directions are selected such that the constraint reaction forces 
corresponding to the kinematical constraints become zero. Then eq. (24) 
can be written as follows, as shown in accompanying paper (Ider [9]). 


My + Q + S + Z T v=F (25) 

where 

Z T = [B qT AT] (26) 
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and 


v T = [X qT r* T ] (27) 

The directions A need to be selected from physical considerations and then 
equations (25) and (17) can be solved together to compute the control 
force magnitudes and the corresponding generalized accelerations. A should 
be chosen such that rank of Z is c, so that C 2 kinematical conditions 
could be controlled. 

Let C be a cx(n-c) matrix orthogonal complement to Z. Premultiplying 
eq. (25) by C T and augmenting with eq. (17), we obtain reduced equations 


C T M 


C T (F-S-Q) 

B 

y = 

. g-By . 


(28) 


The selected control force directions can realize the prescribed 
motions if and only if the augmented mass matrix in eq. (28) is non 
singular. Hence singularity of the augmented mass matrix represents a 
condition to test the solution. In other words, the directions A should be 
such that the vector space spanned by the rows of B and the vector space 
spanned by the rows of 6 T are nonintersecting [9]. 

It has been observed that for flexible robotic systems if the control 
forces are selected in the directions along the corresponding joint 
degrees of freedom the augmented mass matrix becomes full rank and hence 
it Is possible to realize the kinematical constraints by actuators at the 
joints. This will be illustrated by the simulations of a flexible 
manipulator in the next section. 


6. SIMULATIONS OF A FLEXIBLE MANIPULATOR 


In the planar manipulator shown in Figure 2, link 2 is a flexible link, 
while link 1 is treated rigid. The data used for link 1 are Li=1m, mi=30kg 
and I i = 1 0 kg.m 2 . Link 2 is modeled by beam elements with deformation 
displacement and rotation nodal coordinates [10], and L2=2.7m, a=1.8m, 
m2=15kg, E=68. 95x10® N/m 2 and area A=0.0005 m 2 . The longitudinal 
deformation is neglected due to the axial stiffness and the transverse 
deflection is described by the first two modes since higher modes were 
observed to be negligible. Therefore the generalized coordinates of the 
system are 0i , 02, and modal coordinates qi and q 2 . The generalized speed 
vector 

y = [0i , 02 , n 1 , n 2 ] T . 

Initially the system is at rest, and 9i=80° and 02=-16O°. Point A on 
link 2 is required to deploy from the given position 1.5m horizontally. 
The prescribed motion of point A is given as 
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Figure 2 . Flexible manipulator 


xa = 1.5 — (t --^sin^^) + 0.4862 
yA = -0.7878 

The constraint equations in the system can be expressed as 


(29) 


LlCl + L2C12 - Si 2 ( <Pl n 1 + $ 202 ) = XA 

(30) 

Lisi + L2S12 + ci 2 ($1 n 1 + $2 02 ) = yA 

where 0 i and $2 are the values that correspond to the location of point A 
in the first and second eigenvectors respectively. ci=cos 0 i, 
ci 2=cos(0i +02 ) , si=sin 0 i and si2=sin(0i+02). 

At the acceleration level the constraints are given by eq. ( 17 ) where B 
and g are 


B = 


Li S 1 +L 2 S 1 2 +C 1 2 ( (pi n 1 +® 2 H 2 ) 
L1CI+L2CI 2— S12 (tplOl + 02 H 2 ) 


L 2 SI 2 + Cl 2 (Ip 1 0 1 +02(1 2 ) Ol Si 2 (P 2 Si 2 ' 

L2Cl2-Sl2(dini+(>2fl2) (pi Cl 2 <|>2 Cl 2 . 


and 


g = [xa , o] T . 


( 31 ) 
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First the system is simulated using the conventional method. Notice 
that both constraints in the system are kinematical. Our objective is to 
determine joint moments Mi and M 2 that would produce unperturbed desired 
motion of point A. The generalized constraint forces in eq. (15) are 




Xi {Li si +L2S1 2+ci 2 ((pi Hi +d>2 H 2 ) ] }+ X2 {Li C1 + L2C1 2 -SI 2 (<Pi H 1 +<P2il2 ) } 

Fz 


Xl {L2Sl2 + Cl2(<Pini+<H2rj2)} + X2 {L2Cl2-Sl2((pini +<>2 02 ) } 

f! 


Xi $1 si 2 + X2 ® 1 Cl 2 

fS. 


.Xl $2 Si 2 + X2 $2 Cl 2 


The system is simulated for the deployment motion period T=1sec. The 
generalized constraint forces obtained are plotted in Figure 3. ^ Notice 
that if one considers Fi and F§ as joint control moments, f! and F \ will 
be left unaccounted. They cannot be converted to any set of physically 
applicable control forces or moments, and a simulation only with Fi and F 2 
as control moments Mi and M 2 produces perturbations for point A. 



Figure 3. Generalized constraint forces using conventional 
method: 1 . Fi , 2. F 2 , 3. F 3 , Fa 
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Figure 4. Joint moments for unperturbed motion of A. 
Flexible system: 1. Mi , 2. M 2 
Rigid system: 3. Mi , 4. M 2 


The system is then resimulated by the control force approach 
presented in this paper. The control force directions are selected such 
that they correspond to the joint coordinates 9i and 02 , i.e. 



The control forces Z T v become 



(34) 


This means that the required joint moments for unperturbed motion of point 
A are Mi = 1/1 and \\ 2 =vz . 
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With the above control force directions, the augmented mass matrix 
was observed to be full rank, i.e. singularity (or near singularity) did 
not occur, conforming with physical expectations. The joint control 
moments Mi and M 2 that produce unperturbed motion of point A are plotted 
in Figure 4. 

For comparison, the system is resimulated with both bodies considered 
rigid, and the joint moments corresponding to the rigid system are also 
shown in Figure 4. The difference in the control moments for the flexible 
system accounts for the effects of the elastic deformations. 


7. CONCLUSIONS 


This paper presented a general procedure to determine the joint control 
forces and torques in flexible robotic systems, that realize prescribed 
motions in an unperturbed manner. The method is based on a new approach 
for modeling kinematical constraints by general direction control forces. 
The control forces have been selected along the directions of the joint 
degrees of freedom in the generalized space, and the control force 
magnitudes are solved simultaneously with the corresponding system motion. 

It has been shown that with the conventional approach of 
perpendicular constraint forces a solution to the problem cannot be 
obtained. 

In the analysis the flexible bodies have been modeled by finite 
element method and all interactions between the rigid and elastic motion 
have been included. By the procedures presented in this paper the body 
flexibilities can be controlled by applying forces and torques at the 
joints. 


REFERENCES 

1. Yoo, W. S., Haug, E. J., 1986, "Dynamics of Articulated Structures. 
Part I. Theory", Journal af Structural Mechanics . Vol. 14, pp. 

105-126. 

2. Shabana, A. A., 1985, "Automated Analysis of Constrained Systems of 
Rigid and Flexible Bodies", ASME Journal af Vibrations. Acoustics. 
Stress and Reliability in Design, Vol. 107, pp. 431-439. 

3. Kamman, J. W., Huston, R. L., 1984, "Dynamics of Constrained Multibody 
Systems", ASME Journal af Applied Mechanics. Vol. 51, No. 4, 

pp. 899-903. 

4. Hemaml, H. , Weimer, F. C. , 1981, "Modeling of Nonholonomic Dynamic 
Systems with Applications", ASME Journal af Applied Mechanics . 

Vol. 48, No. 1, pp. 177-182. 


432 


5. Nikravesh, P. E., 1984, "Some Methods for Dynamics of Constrained 
Mechanical Systems: A Survey", NATO ASI Series . Vol. F9, Springer- 
Verlag, Berlin, pp. 351-368. 

6. Ider, S. K. , 1988, "Stability and Dynamics of Constrained Flexible 
Multibody Systems", PhD Thesis, University of Illinois at Chicago. 

7. Ider, S. K. , Amirouche, F. M. L., 1989, "Non 1 inear Model ing of 
Flexible Multibody Systems Dynamics Subjected to Variable Constraints" 
ASM£ Jpurnal of Applied Mechanics . Vol. 56, No. 2, pp. 444-450. 

8. Ider, S. K. , Amirouche, F. M. L., 1988, Coordinate Reduction in the 
Dynamics of Constrained Multibody Systems", ASME Journal of AoDlled 
Mechanics. Vol. 55, No. 4, pp. 899-905. 

9. Ider, S. K., 1989, Modeling of Control Forces for Kinematical Cons- 
traints in the Dynamics of Multibody Systems-A New Approach", 3rd 
Annual Cpnf, oq .Aerospace Computational Control . Oxnard, CA, Sept. 

10. Przemleniecki , J. S. , 1968, Theory of Matrix Structural Analysis 
McGraw-Hill, New York. 

11. Naganathan, G. , Soni, A. H., 1986, "Nonlinear Modeling of Kinematic 
and Flexibility Effects in Manipulator Design", ASME Paper, 86-DET-88. 

12. Kane, T. R., Ryan, R. R. , Banerjee, A. K., 1987, "Dynamics of a 
Cantilever Beam Attached to a Moving Base", Jour nal of Guidance 
Co ntrol and Dynamics . Vol. 10, No. 2, pp. 139-151. 


433 



N90- 23035 


CONTROL OF A FLEXIBLE PLANAR TRUSS 
USING PROOF MASS ACTUATORS 
Constantinos Minas* 

Ephrahim Garcia 
Daniel J. Inman 

Department of Mechanical and Aerospace Engineering 
State University of New York at Buffalo 
Buffalo, N.Y. 14260 


* A flexible structure was modelled and actively controlled by using a single space realizable 
linear proof mass actuator. The NASA/UVA/UB actuator was attached to a flexible planar truss 
structure at an "optimal” location and it was considered as both passive and active device. 1 he 
placement of the actuator was specified by examining the eigenvalues of the modified model that 
included the actuator dynamics, and the frequency response functions of the modified system. 1 he 
electronic stiffness of the actuator was specified, such that the proof mass actuator system was 
tuned to the fourth structural mode of the truss by using traditional vibration absorber design. 1 he 
active control law was limited to velocity feedback by integrating of the signals of two 
accelerometers attached to the structure. The two lower modes of the closed-loop structure were 
placed further in the LHS of the complex plane. The theoretically predicted passive and active 
control law was experimentally verified. 

Large continuous structures, like space structures tend to have tight restrictions on the 
actual response of the structure. A passive or active control design is often necessary for the 
structure to satisfy the desired response restrictions. The success of the passive and active control 
design is based on the accuracy of the model that describes the dynamic characteristics of the 
structure. Flexible distributed parameter systems can be successfully modelled by finite element 
analysis 1. This category of structures is lightly damped and tends to have most of its mass 
concentrated at the joints 2. Their natural frequencies are low and appear in closely spaced groups. 
The finite element model of the structure that consists of a mass and a stiffness matrix, can be 
reduced by traditional model reduction techniques by eliminating the insignificant displacements at 
the nodal points 3 . The dissipation energy of the system can be modelled by constructing a system 
damping matrix, by assuming a normal mode system 4 , and by using the damping ratios obtained 

experimentally from modal parameter estimation methods » > . In the case where the 

discrepancy between the analytical model and the experimentally obtained modal model is 
significant, the reduced order analytical damped model can be further modified .such that it is in 
agreement with the experimental natural frequencies, damping ratios and mode shapes 
8,9,10,1 1,12,13. It i s important to realize that the design of the "optimal” control is based on the 
modified reduced order model, but it is actually applied to the real structure. Therefore, the model 
improvement mentioned above, becomes very important and its accuracy is vital in the success ot 

the design of the control law. , . . . 

The structure used here, is a planar truss constructed with space realizable links and joints 
in the configuration presented in fig. 1. The truss is lightly damped and has the behavior of a large 


♦Currently with G.E. Corporate Research and Development Center, Advanced Projects 
Laboratory, Schenectady, N.Y. 12301. 
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space structure, with most of its mass concentrated at the joints 2. It possesses low resonant 
frequencies that appear in closely spaced groups and has both translational and rotational modes of 
vibration. 

The structure is passively and actively controlled by a single actuator. The actuator used in 
this experiment is the NASA/UVA/UB proof mass actuator system. The actuator dynamics are 
taken into consideration and a global model is constructed which includes both the structure and the 
actuator dynamics 14,15 The location of the actuator is specified 16,17 by examining the 
eigenvalues of the uncontrolled global model and the frequency response functions of the global 
system. The actuator is considered as both a passive and an active device with two design 
variables, its electronic stiffness and the generated force. The electronic stiffness is specified such 
that the actuator proof-mass-electronic-spring system is tuned to one of the structural modes of the 
truss by using traditional vibration absorber design 18, 19,20 The generated force of the actuator 
is specified by using output feedback techniques. Here, the active control law was limited to 
velocity feedback by integrating the signals of two accelerometers attached to the structure. The 
objective is to move the two lower modes of the closed-loop structure further in the LHS of the 
complex plane and at the same time maintain stability of the closed-loop system 21,22 The 
theoretically predicted passive and active control law are experimentally implemented and the 
results are evaluated. 

2. Modeling 

2.1 Construction of the Finite Element Model 

The finite element model of the structure was constructed by using the commercially 
available MSC/PAL package for dynamic modeling. The structure weighed 7.335 Kg and was 
constructed with links and joints, mainly made of aluminum alloy. The density of the material was 
measured experimentally by using standard techniques. The Young’s modulus of aluminum alloy 
was used, since the links and joints are mainly constructed with this material. The nodal points of 
the finite element model coincide with the location of the joints of the structure. Every nodal point 
was allowed to have three degrees of freedom, that is translation in the z-axis and rotations about 
the x and y-axis resulting in a 48 -degree-of- freedom model (see Fig. 1). The boundary conditions 
were assumed to be clamped for nodes 15 and 16 and free for the rest of the nodes, since the 
structure was supported as illustrated in fig. 1. After the boundary conditions were applied the final 
model was a 42-degree-of-freedom model. 

2.2 Mass Distribution 

The mass distribution of a non-uniform structure is a problem, that should by no means be 
ignored. Here, two approaches were used. The first approach was to calculate an equivalent 
internal diameter of the hollow links, such that the links had the measured mass. The links were 
treated as uniform hollow tubes constructed with aluminum alloy with an equivalent length of 
0.5m. The joints were modelled as a concentrated mass at the particular location and are treated as 
rigid. The natural frequencies of this model were calculated and are presented in table 1 . The 
results were considered unsatisfactory and one of the links was disassembled for more insight to 
the mass distribution of the link. In the second approach, the real internal diameter of the links was 
used and the excessive mass was distributed to the nodes accordingly. The resulting natural 
frequencies of the model are compared to the experimental results in table 1. The finite element 
model was constructed using a finer grid which include more nodal points, specifically an 
additional nodal point at the mid-point of each link. The resulting model after the boundary 
conditions were applied was a 126-degree-of-freedom model. 

It can be concluded that the 45-node( 1 26-dof) model is not significantly better than the 16- 
node(42-dof) model in predicting the first fourteen natural frequencies. Therefore, it was found 
unnecessary to use the 45-node(126-dof) model in the determination of the control design of the 
structure, since the 16-node(42-dof) model was as accurate. 
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Tabic 1 : Comparison of the theoretical and experimental natural frequencies of 


1 : FEM ~~~~ 1 

TEST 1 (rot accel) ! 

Uniform 

mass distribution 

Corrected mass distribution 



42dof 

42dof 

126dof 

14dof 

SDOF analysis 

Freauencv in Hz 

-J 


i 

1.38 

1.045 

1.048 

1.039 

1.07 



2 

4.56 

3.467 

3.468 

3.469 

3.54 



3 

10.88 

8.050 

8.050 

8.051 

7.94 | 



4 

26.98 

19.894 

19.894 

19.902 

- 



5 

29.68 

21.746 

21.748 

21.750 

- 



6 

30.94 

22.077 

22.074 

22.087 

22.54 



7 

42.63 

30.468 

30.472 

30.477 

32.61 



8 

53.79 

39.268 

39.252 

39.326 

40.35 



9 

68.46 

48.524 

48.521 

48.552 

- 



10 

72.61 

51.746 

51.704 

51.842 

52.51 



11 

82.93 

58.645 

58.629 

58.718 

61.41 



12 

101.93 

71.169 

71.116 

71.275 

65.62 



13 

102.88 

72.090 

72.039 

72.285 

78.24 



14 

116.52 

80.741 

80.610 

80.920 

91.74 



15 

236.64 

219.856 

183.903 

- 

187.13 



2.3 Model Reduction 

Most of the control algorithms are designed for first order systems. Transforming the lo- 
node(42-dof) model in the state space results in a 84-dof state space matrix. This matrix is quite 
large, and it was found that it is difficult to manipulate in vibration prediction, and control 
algorithms. Therefore, it was necessary to reduce the order of the model before performing control 
analysis and designing a control law. From the configuration of the model the rotational degrees of 
freedom can be considered as less significant than the translational ones, and can be eliminated 
from the model by using the Guyan reduction method 3 . The resulting reduced order model is a 
14-dof model. Eigenvalue analysis of this model showed that this model maintained the first 
fourteen natural frequencies of the larger model quite accurately. The damping ratios determined 
from the modal test were used in the construction of the system’s damping matrix, by assuming 
that the system exhibited normal mode behavior. The damping matrix is calculated by the 
following equation: 

D = MU F diag(2C i io i )U F - 1 0 ) 

where U F is the eigenvector matrix of M^K, and ^are the experimentally obtained damping 
ratios. The final reduced order model is described by the following equation: 

Mq(t) +Dq(t) + Kq(t) “0 

This equation describes only the dynamic characteristics of the structure. The actuator dynamics 
were considered important and they were included in the dynamic model. 


2.4 Actuator Dynamics 

The actuator that was used in this experiment was the NASA/UV A/UB proof mass 
actuator, presented in fig.2. The actuator system is comprised of a movable proof mass (m prf = 
0.225Kg), a fixed coil that applies an electromagnetic force on the proof mass, an analog interface 
board, a power amplifier and a linear variable differential transformer (LVDT) sensor. The LVDT 
transducer is an electromechanical transducer that measures the relative position of the proof mass 
with respect to the actuator housing. The actuator can be modelled as single degree of freedom 
mass-spring system, with a variable electronic stiffness and the ability to apply a force on the 
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structure at the attachment point. An equal and opposite force is applied on the proof mass of the 
actuator. The actuator is space-realizable in the sense that it does not have to be attached to the 
ground. The equations of motion are written by taking into account the actuator dynamics^. 
Let’s assume that the actuator is attached to the structure at the ith nodal point. The global system 
that includes both the structure and the actuator dynamics, is of higher order, equal to the order of 
the original system plus the order of the actuator dynamics, and it is described by: 



where q prf is the displacement of the proof mass (m^), the scalars k^and c^are the stiffness and 
damping of the electronic spring of the actuator, m™. is the parasitic mass of the actuator, f g is 
force generated by the actuator, and the matrices M j ,D j and Kj are the following matrices: 

Mj = M + m diag[0,... ,0,1,0, ...,0] (3b) 

K, = K + k Kt dMO,...,0 > 1.0,....0] (3c) 

D, = D + c act diag[0,...,0,l,0,...,0] (3d) 

This is referred to as the open-loop system and the mass, damping and stiffness matrices are 
denoted by subscript ( Q l) f° r convenience. Note that the non-zero elements correspond to the ith 
row oi/and column of the particular matrix or vector of the previous set of equations. The force f g 
is the actuator-generated force applied on the structure. The electronic stiffness of the actuator can 
be selected in a variety of ways for various design approaches. 


3. Passive Control Design 

3.1 Structural Modification Design 

The parasitic mass of the actuator housing has the same effect as adding a dead parasitic 
mass at the point of attachment. Increasing the mass of the structure is a structural modification, 
with the direct effect of reducing the lower natural frequencies of the system. The natural 
frequencies of the new model with the dead mass were examined both theoretically and 
experimentally, and the results are tabulated in table 2. The experimental results are presented in 
the form of point and transfer inertance (transfer function) plots. The transfer function of nodes 1 
and 8, of both the original structure and the modified structure are presented in fig.3 and fig.4 
respectively. The effect of attaching the PMA (inactive) was also examined. This configuration is 
equivalent of having a dead mass equal to the parasitic mass of the actuator housing plus the proof 
mass. However, when the actuators electronic stiffness is activated, the proof mass becomes an 
additional degree of freedom, and it is not part of the parasitic mass any longer. 

The results indicate that the modified structure has lower natural frequencies than the 
original structure. This is true for the first five structural modes as indicated in the table above. 
The experimental frequency response plots show that the level of the vibration response was 
reduced considerably, especially in the lower frequency region. 

If the design methodology was limited to structural modification, it will be considered 
necessary to examine the effect of adding the dead mass at different nodal points. The results are 
presented in table 3. The design criterion that was used to place the actuator was to reduce the 
overall vibration level at node 1 , because a sensitive device will be attached at that point. The 
actuator cannot be placed at node 1 because there is no room. Note that different design criterion 
results in different locations of the actuator. Placing the actuator at node 10 doesn’t reduce the 
vibration at node 1 at all. Nodes 2,3, and 4 have the same effect in reducing the vibration level of 
node 1. But the first structural mode is shifted at 0.92 Hz. This was considered undesirable 
because it is hard to control the low frequencies by active control. Placing the actuator at nodes 6,7 
and 8 has the same effect in reducing the vibration level of node 1 and the first structural mode is 
not shifted considerably. Therefore, any of nodes 6,7, and 8 can be used as an ’’optimal” location 
of the actuator. The results that follow are for placing the actuator at node 8. 
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Table 2 : Comparison of the theoretical and experimental natural frequencies of 


the structure with and without the parasitic mass. 


FEM 

TEST I — 

w/o w 

w/o w dead mass w PMA inactive 



1 

1.04 

0.97 

1.07 

1.01 

1.02 

2 

3.47 

2.94 

3.54 

3.09 

2.96 

3 

8.05 

8.00 

7.94 

7.69 

7.88 

4 

19.90 

16.42 

- 

17.01 

16.03 

5 

21.75 

21.44 

- 


22.39 

6 

22.09 

22.06 

22.54 

22.02 

23.50 

7 

30.48 

28.53 

32.61 

30.08 

29.50 

8 

39.33 

39.12 

40.35 

39.78 

39.33 

9 

48.55 

46.40 

- 

- 

- 

10 

51.84 

51.45 

52.51 

49.31 

50.68 

11 

58.72 

58.52 

61.41 

54.57 

57.36 

12 

71.27 

70.71 

65.62 

65.02 

66.29 

13 

72.28 

72.28 

78.24 

77.73 

78.41 

14 

80.92 

80.74 

91.74 

84.8 

- 


Table 3 : Comparison of the theoretical natural frequencies of the structure with 


the parasitic mass at various nodal points. 


| " FEM 

— 

w/o 

8 

IT 

3 

4 

5 

6 

7 

■P.-J 

Frequency in Hz I 


i 

1.04 

0.97 

0.93 

0.93 

0.92 

0.98 

0.98 

0.98 

1.01 



2 

3.47 

2.94 

3.39 

3.40 

2.94 

2.96 

3.41 

3.40 

3.42 



3 

8.05 

8.00 

7.71 

7.66 

7.65 

7.95 

7.93 

7.95 

7.28 



4 

19.90 

16.42 

18.25 

18.41 

17.47 

15.54 

19.84 

19.88 

19.52 



5 

21.75 

21.44 

21.74 

21.45 

20.17 

21.75 

20.52 

20.24 

20.52 



6 

22.09 

22.06 

21.98 

22.07 

21.77 

21.96 

21.94 

21.75 

22.00 



7 

30.48 

28.53 

30.09 

30.02 

29.60 

27.79 

30.07 

30.43 

29.83 



8 

39.33 

39.12 

39.17 

38.15 

37.87 

36.87 

39.30 

38.92 

37.06 



9 

48.55 

46.40 

45.12 

46.65 

48.35 

48.35 

43.27 

45.40 

43.03 



10 

51.84 

51.45 

51.67 

49.02 

49.76 

50.89 

49.56 

51.40 

51.83 



11 

58.72 

58.52 

54.15 

57.71 

58.47 

58.54 

58.60 

56.68 

56.07 



12 

71.27 

70.71 

68.85 

68.34 

70.27 

70.62 

67.91 

68.31 

68.53 



13 

72.28 

72.28 

71.87 

72.26 

72.26 

72.09 

71.67 

72.23 

71.34 



14 

80.92 

80.74 

80.44 

80.13 

80.69 

80.67 

79.27 

79.27 

77.81 



3.2 Vibration absorber design 

There are several criteria for tuning the absorber to a MDOF structure. The simplest 
criterion is to tune the natural frequency of the absorber to exactly one of the natural frequencies of 
the structured that is: 

to a - coj (4a) 

The design of the damped absorber results in an optimal tuned frequency given by d 


<*a“ 


1+Pi 


(4b) 
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where m is the ratio of the mass of the absorber (here, the proof mass) over the mass of the SDOF 

structure (here, the modal mass at mode to). The ratio \i t or the modal mass can be calculated in a 
trial and error procedure. The difficulty of applying the second method is the fact that it is difficult 

to determine the optimal value for \x for the higher modes 22. 

An optimal tuning criterion for MDOF systems was presented in reference [19]. The 

absorber frequency (co a ) and damping coefficient (c a ) are given by: 

1+Pt 


<o 


2 = 


00/ 


C a 2 = m a 2 00 i 2 p ; 


(1 + M t + M a ) 2 
l+Ht 


where, 


0 + 4t + M a ) 3 


(5a) 


(5b) 


Mt " m t <^ 2 and p a = nyfc. 2 (5c) 

The scalars m t and m a are the parasitic mass and the mass of the absorber, respectively, and the 
scalar 4>j. is the jth entry of the associated eigenvector of the ith mode, where j is the degree of 

freedom corresponding to the location of the absorber. Note that the eigenvectors derived form the 
finite element model, are normalized with respect to the mass matrix. 


3.2.2 Experimental implementation of the passive control design 

The stiffness of the PMA can be electronically varied, such that the actuator system can be 
tuned to different frequencies. The PMA was attached to ground, and the LVDT signal was 
examined for random signal input that generates an electromagnetic force on the proof mass. The 
LVDT signal gives the relative position of the proof mass with respect to the housing of the 
actuator. As it can be clearly seen in the experimental bode plot in fig. 5, the PMA system is well 
modelled by a SDOF system, with a natural frequency depending on the gain that determines the 

electronic stiffness. The stiffness is a function of the external gain (a), and other electromagnetic 
constants of the coil and the amplifier (included in the factor K) . The natural frequency of the 
system is given by: 

u) a =b7TrVaK/m prf (6) 

The damping in the actuator was identified as Coulomb damping due to the friction in the 
bearings. An equivalent viscous coefficient was calculated from the frequency response functions 
of the LVDT signal at particular tuning frequencies. It was found that the lower the tuning 
frequency becomes, the higher the equivalent damping becomes. This is actually due to the fact 
that at low frequencies the proof mass of the actuator cannot overcome the friction. As a 
consequence, the natural frequency of the SDOF model of the actuator dynamics cannot go lower 
than a certain frequency, since the stiffness is electronically determined and it depends on the 
relative motion of the proof mass with respect to the housing of the actuator. It was found that the 
actuator system behaves like an overdamped system when tuned to frequencies below 8 Hz. 
Therefore, it was practically impossible to tune the actuator to frequencies lower than 8 Hz. Note 
that, this range includes the three lower natural frequencies of the modified structure. Therefore, 
the PMA is tuned to the fourth mode, by using the criteria described above. The results from only 
the second criterion are presented here in the top part of fig.6, due to the fact that the plots from the 
simple criterion (equation 4a) and the optimal tuning criterion (equation 5) were very similar. It can 
be clearly seen that the vibration response is clearly reduced. 


4. Active Control design 

The active control law is implemented, by using one actuator and two sensors. The force 
generator signal of the actuator was then given by: 
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f g = FCy(t) (7) 

where F the feedback gain matrix and C the output matrix. The sensors were placed at node 1 and 
node 4 as indicated in fig. 1 . Node 1 was chosen because this is the possible point of attachment of 
a sensitive device, where the vibration level is required to be reduced. Node 4 was chosen, 
because it moves in the opposite direction of node 1, when the structure is excited at one of its 
rotational modes. Here, accelerometers were used and their signals were integrated once by an 
analog computer, to give the corresponding velocity signals. The output position matrix was 
therefore zero, and the velocity output matrix was of the form: 

c *-[o.« i«:.v, 4 ] (8) 

The gain matrix is therefore given by: 

F “ Igi : g 2 l ( 9 ) 

where g, and g 2 are the two gains to be determined. Substituting into the previous equation results 
in: 

f «- F [o,l, 1 S:,V, 4 >« CO) 

The closed-loop system written in physical coordinate system, is given by the following equation: 


Mo L q(t) +D 0 Lq(t) + KQ L q(t) = B 0 LFCjq(t) (11) 

The objective here is to calculate the gain matrix F such that the system has poles at the desired 
locations. The right hand side of the previous equation is expanded as: 

-O-i r 0 7xl5 

R Fr = 1 Iff • <r if 1 0 ixi4] = 8i 0 0 & 0 lxll . 

BqlFC, o lgi.g 2 l[ 0 lx3 1 0 lxl , J ° 6xl 5 (12) 

L-iJ L-gj o o -g 2 o, xl ,J 

Note that this is a square sparse asymmetric matrix with only four non-zero elements. This results 
in a closed-loop system damping matrix of the form: 

“D, 0*i r 0 7xl5 

Dcl- "r + El0 0 0 g2 ° lxl1 (13) 

U 6 X 1 5 

L0-C ac t0CacJ L-g,0 0-g 2 o , x , J J 


where c act , corresponds to the equivalent viscous damping coefficient of the actuator system. 

The objective here, was to decrease the amplitude of the vibration response at the low 
modes that have high participation factors. Note that, direct pole placement design could not be 
applied since with one actuator and two sensors, only one closed-loop pole can be placed. The 
gains were determined in an ad hoc design, from an algorithm that covered a broad region of 
values, with the main objective to move the lower two poles further in the LHS complex plane. 
The results are presented in table 6. It can be clearly seen that the closed-loop system is stable 
when the two gains g, and g 2 , are in the region -10 to 10 and 0 to 15 respectively. A finer grid 
that covered the part of the stable region, where the damping of the first two modes was increased 
(gj from 0 to 10 and g 2 from 10 to 20) was also examined 22. 

It was discovered that the "optimal ” gain of F= [5 : 15] increases the damping on modes 
1, 2, 4, 5, 6 and decreases the damping at mode 3. Note that, further increase of the gains towards 
the "optimal” direction, resulted in an unstable closed-loop system. The experimentally obtained 
transfer functions of nodes 1 and 8, are presented in fig.6, and they are compared directly with the 
open-loop system, tuned to the fourth structural mode. The results show clearly, a decrease in the 
response at modes 1 and 2. The decrease of the vibration response is not very large as desired, 
because of the following reasons: 

(i) By using only one actuator and two sensors, we can only affect 4 elements of the 15x15 closed- 
loop damping matrix. 

(ii) Further increase in the gains towards the "optimal ” direction drives the third mode unstable. 
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(iii) We are trying to control a flexible structure with many significant modes that cannot be 
ignored. 

(iv) We are only using velocity feedback 

It was also illustrated experimentally that by increasing the gains at higher values drove the 
proof mass system unstable. 



U = unstable, S = stable. 

5. Closing Remarks 

An experimental flexible planar truss structure was modelled and successfully controlled in 
a passive and active way by using a space realizable linear proof mass actuator system. The PMA 
was attached to the truss at a desired location, and tuned as traditional vibration absorber to one of 
the structural modes of the truss by using several criteria. The actuator dynamics were 
successfully modelled and taken into consideration in the design of the passive and active control 
law. The active control design was adopted in the form of output velocity feedback by integrating 
the signals of two accelerometers, attached to the structure. The limitations of this method were 
indicated and difficulties of applying output feedback on large flexible structures with several 
significant modes are identified and pointed out. 
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Simulation Studies Using Multibody Dynamics Code DART 

James E. Keat 

Photon Research Associates, Inc. 


Abstract 

DART is a multibody dynamics code developed by Photon Research 
Associates for the Air Force Astronautics Laboratory (AFAL). The 
code is intended primarily to simulate the dynamics of large space 
structures, particularly during the deployment phase of their 
missions. DART integrates nonlinear equations of motion 
numerically. The number of bodies in the system being simulated is 
arbitrary. The bodies' interconnection joints can have an arbitrary 
number of degrees of freedom between 0 and 6. Motions across the 
joints can be large. Provision for simulating on-board control 
systems is provided. Conservation of energy and momentum, when 
applicable, are used to evaluate DART'S performance. 

After a brief description of DART, the paper describes studies 
made to test the program prior to its delivery to AFAL. Three 
studies are described. The first is a large angle reorientating of a 
flexible spacecraft consisting of a rigid central hub and four flexible 
booms. Reorientation was accomplished by a single-cycle sine wave 
shape torque input. In the second study, an appendage, mounted on a 
spacecraft, was slewed through a large angle. Four closed-loop 
control systems provided control of this appendage and of the 
spacecraft's attitude. The third study simulated the deployment of 
the rim of a bicycle wheel configuration large space structure. This 
system contained 18 bodies. An interesting and unexpected feature 
of the dynamics was a pulsing phenomena experienced by the stays 
whose playout was used to control the deployment. 

The paper concludes with a short description of the current 
status of DART. 
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Abstract 

For future in-space construction facility, one or more space 
cranes capable of manipulating and positioning large and massive 
spacecraft components will be needed. Because the space systems 
being constructed are relatively large and massive, the space cranes 
must have a reach on the order of 100-meter and be made of truss- 
type construction for structural efficiency. In order to optimize space 
crane's performance, an operational strategy consisting of gross- 
motion and fine-motion phases was proposed. Under this strategy, a 
space crane is commanded into position in a relatively fast pre- 
planned trajectory with relaxed requirements, and then "rigidized" 
by bracing against either the workpiece or an auxiliary support 
structure. After bracing, the subsequent fine motion will not involve 
the major crane bodies, and the precision movements between the 
workpieces can be performed without the adverse flexible crane 
body effect. 

Inverse dynamics has been extensively studied as a basis for 
trajectory generation and control of robot manipulators. This paper 
will focus on trajectory generation in the gross-motion phase of space 
crane operation. Inverse dynamics of the flexible crane body is much 
more complex and intricate as compared with a rigid robot link. To 
model and solve the space crane's inverse dynamics problem, 
LATDYN program which employs a three-dimensional finite element 
formulation for the multibody truss-type structures will be used. 
The formulation is oriented toward a joint dominated structure which 
is suitable for the proposed space crane concept. To track a planned 
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trajectory, procedures will be developed to obtain the actuation 
profile and dynamics envelope which are pertinent to the design and 
performance requirements of the space crane concept. 
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MINIMUM ATTAINABLE RMS ATTITUDE ERROR 
USING CO-LOCATED RATE SENSORS 


A. V. Balakrishnant 


Abstract 

In this paper we announce a closed form analytical expression for the minimum attain- 
able attitude error (as well as the error rate) in a flexible beam by feedback control using 
co-located rate sensors. For simplicity, we consider a beam clamped at one end with an 
offset mass (antenna) at the other end where the controls and sensors are located. Both 
control moment generators and force actuators are provided. The results apply to any beam- 
like lattice-type truss, and provide the kind of performance criteria needed under CSI — 
Controls-Structures-Integrated optimization. 


t Research Supported in part under NAS1-18585 Task Assignment 49. 
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1. Introduction 


One of the challenges in the Design Challenge For Flexible Flight Structure Control 
System Design formulated in the inaugural paper on SCOLE [1] was to hold the antenna 
pointing error within ±0.02 degrees after slewing by appropriate feedback control. In this 
paper we derive a closed form expression for the minimal achievable mean square pointing 
error using co-located rate sensors. A slightly simplified form of the SCOLE article (which 
eliminates rigid-body modes) is used: a cantilevered beam with an offset mass where the 
controls — both c.m.g.'s and force actuators — and the rate sensors are located. Our results 
are in terms of continuum model parameters — the uniform Bernoulli version is used. The 
beam dynamics are given in Section 2. The main results are in Section 3. We note that a 
technique for deriving equivalent Bernoulli beam parameters for various types of trusses is 
described by Noor and Anderson in [4]. Recently Noor and Russell [5] presented equivalent 
anisotropic Timoshenko beam models for beam-like lattice trusses with an arbitrary degree of 
modal coupling, which appear to yield excellent agreement with modal frequencies derived 
from finite element models. Our theory is able to handle these Timoshenko models, and 
moreover we can also use it for rigid-body modes, although they are not included here. Thus 
our results can be used for any beam-like lattice truss structure. 
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2. The Model 


We consider a uniform Bernoulli beam clamped at one end with an offset mass (antenna) 
at the other end which also houses the sensors and actuators. See Figure 1. We allow for both 
force actuators and moment actuators. The sensors are rate gyros. Because of the clamping at 
one end, no rigid-body modes are involved and hence no attitude sensors are needed. 

We allow bending in two mutually perpendicular planes containing the beam axis, as 
well as torsion in the plane perpendicular to the beam axis, all uncoupled. The continuum 
model (uniform Beroulli beam) dynamics can then be described by the following partial differ- 
ential equations (similar to those in [2, 3]). Let the beam extend along the z-axis, 0 < s < L, 
and let M s > t), u Q (s, t), denote the bending displacements and u y (s, t) the torsion angle 
about the beam axis. Let in the usual notation (cf. [1]), £7^, £7 e denote the flexural stiffness 
and GI W the torsional rigidity. Let p denote the mass per unit area and A the cross-sectional 

-0, 0<s<L;0<t 

-0, 0 < s < L-, 0 < t 

d^iivis, t ) 

p/ y — - G/y (s, /) « 0 , 0<s<L;0<t 

with the clamped boundaty conditions at s - 0 : 



<Pu*(s, t ) 

d*Uj(s, t) 

pA 

-fc— 

+ £ '» 


5 2 «e( J > 0 

,,, ^Mj, 0 

pA 

dt 2 

+ £/ e ds* 


MO, r) = u e (0, 0 - «y(0, /) = 0 

MO, 0 « MO, r) - 0 . 

The antenna center of gravity is located at 

(r*, • 

The distance from the beam tip to antenna center of gravity is denoted by 

M - Vr* + . 
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Figure 1 


i 


: Shuttle/Antenna Configuration 
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The force balance equation at s = L yields 


m 


1 0 
0 1 


fi* (L, t) 

«e ih 0 

«y(£, 0 


/l(0 

/lW 


EI<u;\L,t) 
EIqUq\L, t) 


where m is the antenna mass and /i( ), / 2 ( ) are the applied control forces. The torque 


balance equations yield 


0 = 

El^u;'(L,t) 

£/ e Mg (L, f) 

+ i a © + Af(f) + r ® 

fi(t) 

hit) 


G/ y Uy (L, t) 


0 


+ r ® 


t) + r x Uy(L, t ) 
ii e (L,r) + r y u^(L,t) 


where the superdots indicate time derivatives and the primes the derivatives with respect to 
the spatial variable s ; ® denotes the vector cross-product and © the angular rate vector 


© 


My ( L , t ) 


I a denotes the moment of inertia of the antenna about the beam tip (s - L) and finally, M(t) 
denotes the applied control moment. 

It is convenient to denote by 6(f) the boundary vector: 


6(f) 


(L, f) 
M 0 ' (L, t) 

My ( L , t) 


The boundary rate vector would thus be 6(f) . Hence our sensor model is: 

v(0 = 6(f) + N 0 (t) 

where we assume that N 0 (t) is white Gaussian noise with spectral density matrix d Q l, where 
/ is the identity (5x5) matrix. Similarly we assume that the control actuators are also 
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characterized by additive white Gaussian noise. Denoting the applied control vector by u(t) : 

u\(L,t) 

u 2 (L, t) 

u(t) = u 3 (L,t) . 

u 4 (L, t) 

«5 (L, 0 

we have 

/ 1(0 

flit) - u(t) + N s (t) 

M(t) 

where N s (t) is white Gaussian with spectral density d s . We shall also use M b to denote the 
actuator mass/inertia matrix 

m 0 0 0 mr x 

0 m 0 0 mr y 

M b - 0 0 

0 0 la 

mr x mr y 

where 

r) -r x r y 0 

h “ L + ~ r x r y r l 0 

0 0 + 

where l a is the antenna moment of inertia about its center of gravity. For any control input 

u(-) (which must perforce be a “feedback” control, based on the sensor data v(-)) the mean 

square pointing error is then expressed by: 

, f T T T 1 

lim =;j f (L, tf dt + J « e (L, t ) 1 dt + \r \ 2 J u^(L, t ) 2 dt> 

0 0 i 

and the mean square pointing rate is given by 
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! r T T 

1 im = J w 0 (L, r ) 2 dr + J «e(L, r ) 2 dr + |r | 2 f u^(L, r ) 2 dt h . 

t '-* 00 1 0 0 0 - 

From the results in [6] it follows that the minimal attainable mean square pointing error is 
given by 

aM~ b 'a* 

where 

a « row vector (1, 1, 0, 0, |r|) 
a * - transpose of a 


3. Main Results 

We need some notation first. The mean square attitude response, whatever the feed- 
back control used is defined by 

{ T T T 1 

f Uq (t, l ) 2 dt + J u e (r, l ) 2 dt + |r | 2 J* K v (r, l ) 2 dt\ . ( 3 . 1 ) 

oo o J 

This is recognized as the mean square displacement of the center of gravity of the antenna 

which is then also proportional to the mean square “pointing” error — see [ 1 ] for the 

relationships. 

Next let u denote any (vector) of control inputs — a constant “step” input: 


u 


«i 

“2 

“3 

“4 

“5 


(3.1) 


Solve the equations 


EW"(s) = 0 1 
E/ e u S'\s) = 0 • 
Gl v u{(s) - 0 j 


0 < s < L , 


(3.2) 
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subject to the end conditions 


E1)U;\L) - u, 

ei^xl) - « 2 

EI^(L) + u 3 - 0 • 

El ML) + «4 - 0 
GlyU^(L) + «5 - 0 - 


(3.3) 


Note that the solution can be recognized as the steady-state response of the system to the 
step-input u, assuming that there is some damping. We only need to calculate the response to 
three specialized inputs: 

Calculate the response to u^(L) to the special case. Case 1, where: 

Ui - 1 

u, - 0 , 2 £ i <. 5 . 

Calculate the response u 0 (L) to the special case. Case 2, where: 

U) « 0 

u 2 - l 

U 3 « «4 - «5 - 0 . 

Calculate the response ( L ) to the special case, Case 3, where 

U] - « 2 “ “3 “ u * ~ 0 
« 6 - 1 • 

Then the minimal achievable mean-square response whatever the choice of the feedback and 
whatever the mean-square control effort, is given by 

^d s d 0 (iq, (L) 2 + Uq(L) 2 + i^UyiL) 2 ) . (3.4) 

This is our main result. Unfortunately the derivation is beyond the scope of this report and 
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will be published elsewhere. To proceed further with (3.4) we calculate the solution of (3.2), 
(3.3) explicitly. Thus for any u, the solution is of die form 


where 


Thus for Case 1 we have 


and for Case 2 we have 


and for Case 3: 


«<,(*) - 03 * 3 + » 

u Q (s) - bys 3 + bis 1 , 

Uy(s) - C|5 , 


*3 


- 

6 E/ 0 


a 2 = 


1 

\JZ- + 

a, l 

2 

[El* + 

El*. 

1 

r-*SL. + 

u,L 

2 

[ei* + 

El*. 

JS_ 




u*(L ) 2 


l? 

3 £/ e 


u^(L) 2 

Hence the mean-square attitude error 


L 

g/ ¥ ■ 


0 < s < L 
0 < s < L 
0 < s < L 


JdT\JL. + 

10 [3£/^ + 3 E/ e G/ v J • 


(3-5) 


Note the appearance of the noise parameters in (3.5) in product form. 

The technique for calculating the minimal mean square attdude error in more complex 
models than that illustrated is the same: calculate the mean square step response (assuming 
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some damping) to unit step inputs. 

In conclusion we suggest this result (3.5) can be the basis for combined structures- 
controls optimization — CSI, since the required structural parameters can be calculated for a 
lattice truss from the material gage and physical dimensions as in [4, 5]. We omit the details 
of these calculations. 
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ABSTRACT 

The kinematics, dynamics, Jacobian, and their corresponding inverse computations are six essential problems 
in the control of robot manip ulat ors. Efficient parallel algorithms for these computations are discussed and 
analyzed. Their characteristics are identified and a scheme on the mapping of these algorithms to a reconfigurable 
parallel architecture is presented. Based on the characteristics including type of parallelism, degree of parallelism, 
uniformity of the operations, fundamental operations, data dependencies, and communication requirement, it is 
shown that most of the algorithms for robotic computations possess highly regular properties and some common 
structures, especially the linear recursive structure. Moreover, they are well-suited to be implemented on a single- 
i ns true lion -stream multiple-data-stream (SIMD) computer with reconfigurable interconnection network. The model 
of a reconfigurable dual network SIMD machine with internal direct feedback is introduced. A systematic pro- 
cedure to map these computations to the proposed machine is presented. A new scheduling problem for SIMD 
machines is investigated and a heuristic algorithm, called neighborhood scheduling, that reorders the processing 
of subtasks to reduce the communication time is described. Mapping results of a benchmark algorithm are 
illustrated and discussed. 

1. Introduction 

Robot manipulators are highly nonlinear systems and their dynamic performance is directly dependent on the 
efficiency of the kinematic and dynamic models, the control schemes/algorithms, and the computer architecture for 
computing the control schemes. In general, robot manipulators are usually servoed in the joint-variable space while 
the objects to be manipulated are usually expressed in the world (or Cartesian) coordinate system. In order to con- 
trol the position and orientation of the manipulator end-effector, the robot controller is required to compute, at a 
sufficient rate, such tasks as coordinate transformation between the joint-variable space and the Cartesian space, 
generalized forces/torques to drive the joint motors, the manipulator inertia matrix for model-based control schemes, 
and the Jacobian matrix which relates the joint velocity in the joint-variable space to the Cartesian space. These are 
the basic robotic computations for the control of robot manipulators. They are equivalent to the computations of 
kinematics, dynamics, Jacobian, and their corresponding inverses. These six basic robotics computations are 
required at various stages of robot arm control and computer simulation of robot motion, and reveal a basic charac- 
teristic and common problem in robot manipulator control — intensive computations with a high level of data 
dependency. They have become major computational bottlenecks in the control of robot manipulators. Despite 
their impressive speed, conventional general-purpose uniprocessor computers cannot efficiently handle the kinemat- 
ics and dynamics computations at the required computation rate because their architectures limit them to a mostly 
serial approach to computation. Furthermore, less efficient, serial computational algorithms must be used to com- 
pute these robotics computations on a uniprocessor computer. Consequently, the quest for real-time robot arm con- 
trol and motion simulation rests on the study and development of parallel algorithms of lower computational com- 
plexity with faster computational structures. The ultimate goal is to achieve an order-of-magnitude and/or an 
order-of-complexity improvement in computational efficiency in these robotics computations by taking advantage of 
parallelism, pipelining, and architectures. 

A common feature of today’s research on robotic computational problems is that a specific problem, mostly 
the inverse dynamics or the inverse kinematics, is studied at a time, and usually an algorithmically-specialized 
architecture or processor is developed for that particular algorithm. Obviously, this specialized architecture can 
make the most use of the parallel properties of the algorithm. However, most advanced robot control schemes 
always require to solve a combination of some or all of the six basic robotic computations. One solution for this 
problem is to wire these specialized architectures or processors together. This method is inflexible because the com- 
bination of these components is dedicated to a particular control scheme and cannot be used efficiently for another 
scheme. Another solution is to connect the architectures or processors to a bus as peripherals of a general-purpose 
computer. This is more flexible, but the bus becomes a bottleneck and time is wasted in data movements between 
different computational processes. Another possible solution is focussed on partitioning the original algorithm/task 
into a set of subtasks with precedence relationship and then developing efficient scheduling algorithms to map these 
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subtasks onto a general-purpose multiprocessor system. This solution is much more flexible because most computa- 
tional algorithms can be represented by directed task graphs. However, this approach may result in ignoring some 
inherent parallelism in robotics algorithms. 

In this paper, we shall address these robotic computational problems, and major effort is focussed on finding a 
scheme which provides the flexibility needed to solve robotic computational problems on the same architecture 
while maintaining high efficiency by taking into account the inherent parallelism of robotics algorithms. To exploit 
the inherent parallelism of these robotics algorithms, our approach is first to characterize the set of parallel robotic 
algorithms based on the six specified characteristics and features, including type of parallelism, degree of parallel- 
ism, uniformity of the operations, fundamental operations, data dependency, and communication requirement. Our 
analysis shows that machines operating in the single-instruction-stream multiple-data-stream (SIMD) mode are the 
most efficient and suitable for our robotic algorithms. By fully considering the common characteristics and inherent 
parallelism of the robotics algorithms, a prototype of a medium-grained, reconfigurable, dual-network, SIMD 
machine with internal direct feedback has been designed for the computation of these kinematic and dynamic com- 
putational tasks. A systematic mapping procedure has been developed for scheduling these robotic computational 
tasks onto the proposed SIMD machine. This procedure builds a task table which contains the subtask assignment 
from the original parallel algorithm. Then a simplified task table and an input table are produced through the nota- 
tion simplification. These two tables are then used as inputs to the neighborhood scheduling algorithm which reord- 
ers the processing sequence of the subtasks into a rescheduled task table to reduce the communication time. Finally, 
the subtasks in this rescheduled task table are mapped onto the proposed SIMD machine and a control table which 
describes the control sequence in the machine is produced. A benchmark algorithm which contains the characteris- 
tics of the six basic robotic computations has been implemented on the proposed SIMD machine, and the mapping 
results are included for discussion. 

2. Characteristics of Parallel Algorithms 

A key factor to the design of a parallel architecture for a group of algorithms is the understanding of their 
architectural requirements, and this requires us to identify the characteristics of these algorithms. This identification 
is usually helpful because the algorithms from a given application area such as robotics often possess an identifiable 
structure. In order to examine the characteristics of the six basic robotics parallel algorithms, a set of features which 
have the greatest effects on the execution of parallel algorithms is defined for robotics application [1]. 

■ Type of parallelism. Two levels of parallelism can be identified. 

(a) Job-level parallelism . The original algorithm is reformulated to a parallel processable form. In this level, 
the variables carrying the same kind of information but with different indices (e.g., for different links or 
joints of a manipulator) are processed parallelly. Due to the nature of the robot’s serial link structure, vari- 
ables representing the same physical meaning are defined for each link such as joint velocities, joint 
accelerations, and joint torques. Usually, the same class of variables are produced through an identical 
computational procedure but with different set of data. This property is called uniformity of operations as 
defined below. So the job-level parallelism will often be amenable to the SIMD implementation and usu- 
ally the required number of processors depends on the number of degrees of freedom of the manipulator 
(i.e., one processor for each joint). 

(b) Task-level parallelism . The original algorithm is decomposed into multiple subtasks. While the computa- 
tion within a subtask is serial, the number of subtasks that can be processed concurrently is maximized by 
using some scheduling techniques. Obviously, this implies multiple-instruction-stream multiple-data- 
stream (MIMD) operations. Furthermore, for this level of parallelism, a subtask usually performs the same 
computation for different set of data, and hence the operation can be pipelined. An advantage of this task- 
level parallelism is that the required number of processors is independent of the number of degrees of free- 
dom of the manipulator. 

■ Degree of parallelism (Granularity). Three levels of granularity are distinguished. In the large grain granular- 
ity, the parallelism is performed at the algorithmic level. That is, only the parallelism between different seg- 
ments or subtasks is considered. For the medium grain granularity , the concurrency is considered at the opera- 
tion level and the parallelism is performed based on some basic mathematical operations such as vector cross 
product and matrix-vector multiplication. If we consider the implementation of parallelism within the basic 
arithmetic operations, then the fine grain granularity is achieved. Different degrees of parallelism often imply 
different synchronization requirements. The finer the granularity is, the more frequent synchronization is 
required. 

■ Uniformity of operations. A robotics algorithm is said to possess uniformity of operations if the required com- 
putations for some set of variables, especially the joint variables, are uniform. An algorithm with operation uni- 
formity can be implemented on an SIMD machine with higher efficiency. 
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■ Fundamental operations. Algorithms in an application area usually perform similar mathematical operations. 
The identification of basic operations performed in the algorithm will dictate the processor capabilities needed. 

■ Data dependency. Three kinds of data dependency are classified for robotics algorithms: local neighborhood 
dependency, special type dependency, and global dependency. The local dependency means that the required 
operands in an operation come from its neighborhood; for example, from the results of last operation or using 
the same operands of last operation. The special type dependency is defined for some special equation or prob- 
lem. There are some special types of data dependency that are peculiar and inherent to the robotics algorithms. 
Among them, the homogeneous (or hetero-homogeneous) linear recursive type of dependency which describes 
the data dependency in a homogeneous (or hetero-homogeneous) linear recursive equation appears most fre- 
quently. This linear recurrence structure plays a major role in the robotics algorithms because die variables of a 
joint are usually related to the corresponding variables of its adjacent joint due to the robot s serial link structure. 
Other special types of data dependency are defined for some well-known problems; for example, system of 
linear equations and Column-Sweeping algorithm for a triangular linear system. The global dependency means 
that the results of some operations may be required by other operations or equations that may appear in other 
places of the algorithm. Since few algorithms possess absolutely one kind of data dependency, we can just iden- 
tify whether an algorithm is local data dependency oriented or not. The data dependency in an algorithm usually 
dictates memory organization, data allocation, and communication requirements. 

■ Communication requirement. The communication requirement decides the required interconnection type 
between processor and processor or between processor and memory. Three types of interconnection are con- 
sidered: one-to-one connection, permutation and broadcast connections. Of course, the exact required inter- 
connection type for each computation in an algorithm depends on many factors such as task assignment of each 
processor, data allocation in the memories, and data dependency of each computation. Hence, the exact 
required interconnection type can only be decided at the time of the algorithm-architecture mapping process. In 
examining the features of robotics parallel algorithms, only rough connection requirements can be observed. 

3. Characterization of Basic Robotics Parallel Algorithms 

Based on the above set of features, each of the six basic robotics algorithms have been carefully examined 
and analyzed to find the common features and characteristics among them [2]. Only the final results are presented 
here, which are useful for better understanding of the robotics computations and to designing a suitable parallel 
architecture for their computations. 

Invars* Dynamics Problem. Among various methods for computing the inverse dynamics problem, the one based 
on the Newton-Euler (NE) equations of modem is the most efficient [3]. Since this method has been shown to pos- 
sess the time lower bound of 0(n) running on uniprocessor computers, where n is the number of degrees-of- 
freedom of the manipulator, further substantial improvements in computational efficiency appear unlikely. 
Nevertheless, some improvements could be achieved by taking advantage of particular computation structures [4], 
customized algorithms/architectures for specific manipulators [5], parallel computations [6,7], and scheduling algo- 
rithms for multiprocessor systems [8-1 1]. 

Forward Dynamics Problem. Among various methods to solving the forward dynamics problem [12-14], the 
composite rigid-body method [12], based on the computation of the NE equations of motion, is widely used to 
develop efficient parallel algorithms [14-16]. The composite rigid-body method is suitable for parallel processing 
because efficient parallel algorithms for the inverse dynamics computation have been well developed and can be 
used to speed up the computation time. 

Forward Klnamatic* Problem. Using the Denavit-Hartenberg matrix representation to establishing the link coor- 
dinate frames [17,18], the solution to the forward kinematics problem is the successive multiplication of the 4x4 
homogeneous link transformation matrices for an n-link manipulator 

t = a£a?a! • • • a}_i • • • a;_, (i) 

where A|_! is the D-H link transformation matrix which relates the ith coordinate frame to the (i-l)th coordinate 
frame [17,18]. The above successive matrix multiplication equation can be reformulated in a homogeneous linear 
recursive form 

Tj = Aj and T‘ 0 = Tff 1 AU to i = 2, • • • ,n , (2) 

from which the configuration of all the coordinate frames can be obtained at the time lower bound [7,19,20]. 

Forward Jacobian Problam. Existing methods in computing the Jacobian are mostly confined to uniprocessor 
computers. In particular, Orin/Schrader [21], and Yeung/Lee [22] exploited the linear recurrence characteristics of 
the Jacobian eq uations. These methods differed from each other only by a different selection of the reference 
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coordinate frame for computation. The reference coordinate frame is selected such that all the vectors and matrices 
and the Jacobian computed are referred to that reference coordinate system. They all have the computational order 
of 0(n ) for an n-jointed manipulator. 

Inverse Jacobian Problem. The inverse Jacobian algorithms for a general manipulator can be divided into two 
categories. One is to calculate the inverse or the generalized inverse Jacobian explicitly [23]. The other is to con- 
sider the inverse Jacobian problem as a system of linear equations and solve the joint rate from the Cartesian velo- 
city implicitly [24]. For practical purposes, the latter approach is easier to be parallelized due to the use of some 
standard techniques to solve a system of linear equations such as the Gaussian elimination method. 

Inverse Kinematics Problem. In general, the inverse kinematic position solution can be obtained by various tech- 
niques [18], among which the inverse transform [25] and the iterative method [26] are widely discussed. The 
inverse transform technique yields a set of explicit, non-iterative joint angle equations which involve multiplica- 
tions, additions, square root, and transcendental function operations. The iterative methods can obtain robot 
independent joint solution, but they usually have some disadvantages: more computations than the closed-form 
solution, variable computation time and, more important, convergence problem, especially in the singular and 
degenerate cases. We shall examine the characteristics of the inverse transform technique and the iterative methods. 

The equations for closed-form solution appear highly non-uniform [27]. To achieve higher parallelism for the 
inverse kinematics problem, the iterative method provides a better approach, since nearly every presented iterative 
method contains the computations of forward kinematics, forward Jacobian, and inverse Jacobian [26], which have 
been shown to be highly parallelized. 

If we consider these six basic robotics computations as a set of tasks that we need to compute for the control 
of robot manipulators, then we need to find their common features and characteristics so that a parallel architecture 
can be designed to efficiently compute these tasks. The characteristics of the six basic robotics algorithms are tabu- 
lated in Table 1 and it shows that these algorithms do possess some important common features and characteristics. 
This is especially true for the inverse dynamics, the forward dynamics, the forward kinematics, and the forward 
Jacobian computations for the following three reasons. First, they are all suitable to be parallelized at the job-level 
and the parallelization can be performed at the large, medium, and fine grain granularities simultaneously, although 
different granularities are emphasized in each individual algorithm. Second, their operations are all uniform for the 
variables corresponding to each joint, and the most important fundamental operation is the matrix-vector operation. 
Finally, the strongest common feature is that they are all in homogeneous linear recursive form, for which the recur- 
sive doubling technique can be applied to achieve the time lower bound of 0(r^og 2 /i] ). The communication 
requirement indicates that one-to-one and some regular or irregular permutation capabilities are required for these 
four computational problems and the broadcast capability is necessary for the forward dynamics and the forward 
Jacobian algorithms. This indicates that some efficient, versatile network is required in the parallel architecture for 
their computations. 

The inverse Jacobian and the inverse kinematics computations may seem less common to the above four algo- 
rithms. However, if less efficient methods to solve these two problems are chosen individually, then these two algo- 
rithms may possess some common features to the other four algorithms, and a common parallel architecture can be 
designed to match all these common characteristics for their computations. From previous discussions, we found 
that either the direct method or the iterative method for the inverse Jacobian is a proper candidate for parallel pro- 
cessing, while the direct method is more efficient with somewhat complex data dependencies. For the inverse 
kinematics problem, only the iterative method possesses regular properties similar to the other four computations. 

With all the characteristics listed in Table 1, we shall next examine how to reformulate and parallelize these 
robotics algorithms from their original serial algorithms by complying to their common features [2], The paralleli- 
zation process is performed at the job level; that is, we try to express the original algorithms as a sequence of serial 
steps (jobs). Each individual step is accomplished through the cooperation of all the processors and for each step, 
the operations of each processor are almost identical by using one of their common features: the uniformity of 
operations. Hence, each step can be considered to be a single instruction in a serial program. Two different steps 
(or jobs) are identified after the parallelization process: single steps and macro steps. The notion of “single instruc- 
tion” and “subroutine” of a serial program can be used to distinguish between these two steps. A single step 
corresponds to a single instruction in a serial program, while a macro step corresponds to a subroutine in a serial 
program. The macro steps require more complex parallel computations for all the processors, for example, the 
homogeneous linear recursive equation, the hetero-homogeneous linear recursive equation, and the system of linear 
equations are all macro steps. These macro steps are identified by their completeness and repeatity. The complete- 
ness means that the step can be treated as an individual problem. The technique to process these macro steps paral- 
lelly needs special consideration and the algorithm to solve these steps is so well-structured that finer decomposition 
is not helpful or even impossible, for example, the parallel recursive doubling technique for solving the homogene- 
ous linear recursive equation, or the parallel Cholesky factorization technique for solving the system linear 
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equations with a symmetric-positive-definite square matrix. The repea tity means that the problem which can be 
solved in the step is so important and common that it appears repetitively at many other places; for example, many 
equations of robotics algorithms are in homogeneous linear recursive form, then the procedure for paralleliy solving 
this problem can be applied to all these places. The method to parallelize each of these macro steps is designed 
separately. 

Instead of computing all the six basic robotics algorithms, we synthesize a benchmark algorithm (see Table 2) 
which represents the general structure of the basic robotics parallel algorithms. This benchmark algorithm consists 
of six serial steps, and each step needs the cooperation of n processors. This benchmark algorithm will be used to 
demonstrate the whole process of mapping the "serial type" parallel algorithms onto a proposed parallel architecture 
in the following sections. 

4. Design of AlgorKhmically-Specialized Parallel Architecture 

In this section, an appropriate parallel architecture with the attributes that best match the common features of 
the six basic robotics parallel algorithms is designed. The important parallel architecture attributes include the type 
of machine (e.g., SIMD or MIMD mode), number of processors, synchronization requirement, processor capabili- 
ties, memory organization, and network requirement Each of these attributes is affected by one or more features of 
the six basic robotics algorithms discussed in section 3. Detailed consideration for the design of this machine can be 
found in [2]. With all these requirements and attributes, the appropriate parallel architecture is a reconfigurable, 
dual-network , SIMD (DN-SIMD) machine for the computation of robotic algorithms. 

The structure of the proposed DN-SIMD machine, as shown in Fig. 1, consists of multiple processing ele- 
ments, two reconfigurable interconnection networks (RIN1 and RIN2), a set of global data registers (GDRs), three 
data buffers including register output buffer, PE output buffer and input data buffer (IDB), and a set of multiplexers. 
Ail of these are coordinated by a central control unit (CU) which is not shown in Fig. 1. The functions of each ele- 
ment are briefly described here. 

1. Processing Element (PE). There are n identical PEs. Each PE is essentially an arithmetic logic unit (ALU) 
with attached working registers (see Fig. 1). All the ALUs perform the same programmable function synchro- 
nously in a lock-step fashion under the command of the CU. Some of the PEs can be masked (disabled) for 
some computation period, while other unmasked or enabled PEs perform computations. Each PE has two input 
working registers (IWRs) which are used to store two operands for each computation, and one output working 
register (OWR) which is used to store the current result of each computation. The operands in the IWRs are 
kept there until they are replaced. Thus, they can be used repetitively if one or two operands are common for a 
series of continuous computations. An inner loop connection within a PE is designed, which connects the 
OWR to one of the two IWRs. This provides an immediate inner-PE forwarding path such that the current 
result can be used as an operand for the next computation immediately. 

2. Global data registers (GDRs). There are n groups of data registers which correspond to the n global memory 
modules. In each computation period, the registers with the same relative position in each group can be 
accessed under the control of the CU. The result of each computation from each PE will be stored in the GDRs 
only when either the result is the final output or the result will be used in later computations but not the 
immediate following one, which can make use of the internal forwarding path for data exchange among PEs or 
inner loop within PEs. 

3. Reconfigurable interconnection networks. There are two sets of identical interconnection networks: RIN1 and 
RIN2. They are assumed to have full connectivity including one-to-one, permutation, and broadcast capabili- 
ties (e.g., the crossbar network). The RIN1 connects the GDRs to the PEs. This provides the paths for sending 
required operands to the appropriate PEs. The RIN2 makes the connection from the outputs of PEs to the 
inputs of PEs; this provides the direct paths for internal forwarding data exchange among PEs. It should be 
noted that, if necessary, the output of PE i can be stored into its corresponding memory module i. This is not 
affected by the RIN2. 

4. Data buffers. There are three sets of data buffers. The register output buffer allows the “current computa- 
tion” and the “RIN1 reconfiguration and operand fetch for the next computation” be processed at the same 
time. The PE output buffer allows the “current computation” and the “RIN2 reconfiguration and output data 
storing” be processed simultaneously. The input data buffer (IDB) is the buffer for operands directly from 
external input data. 

5. Multiplexer. The n multiplexers in advance of the n PEs are used to select proper operands to enter PEs from 
three possible sources: GDR, IDB, and IFD exchange. They are also under the control of the control unit 

With the functions of these elements described above, the basic mathematical operations performed by each 
PE of the DN-SIMD machine involve at most two operands. 
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T = A o B 


(3) 

where A and B are two arbitrary operands and they can be scalar, vector or matrix, and “o" indicates the operation 
performed by the PE, When either A or B is null, the computation only involves one operand such as the transpose 
of a matrix. The operands A or B may come from five different sources. They are GDRs through RIN 1 , IFD 
exchange through RIN2, IDB, IWR within PE, and OWR within PE through inner loop connection. The result T 
may be sent to two possible destinations: GDRs directly, or PEs through RIN2 via IFD exchange. The possible 
input operands and output result transfer path diagrams are illustrated in Figs. 2(a) and 2(b) respectively. In Fig. 
2(a), we demonstrate all the possible source combinations except the case that the operands come from the IWR 
within the PE. We assume that the time to transfer one operand from the GDR or the IDB to a PE (i.e., operand 
fetching) is the same as the time to transfer the output result from a PE to the GDR (i.e., result storing) and equals to 
the computation time of one basic PE operation. This time interval is called a cycle. Since operands fetching, com- 
putation, and result storing can be performed simultaneously due to the data buffers designed in this system, a 
three-stage pipelined operation can be performed on our DN-SIMD machine. Since a computation usually needs 
two operands A and B, and if A and B come from different sources, then they can be transferred to a PE simultane- 
ously in one period. In this case, the three-stage pipelined operation proceeds normally. However, if A and B come 
from the same source (e.g., GDR or IDB), then it will take 2 cycles to transfer them. This situation is called the 
double transmission required (DTR) computation. In this case, a delay period must be added to the pipeline opera- 
tion to synchronize the operation. This DTR computation obviously will slow down the system speed. Hence, we 
need to minimize the number of DTR computations in a computational task. 

5. Mapping of Parallel Robotic Algorithms onto the Dual-Network SIMD Machine 

Since our DN-SIMD machine was designed to best match the common characteristics of the six basic robotics 
parallel algorithms, the scheduling of their computations in our system is more straightforward with less difficulties 
as compared with other general mapping problems. Based on this characteristics matching, a systematic and 
efficient mapping procedure is developed to map the parallel robotic algorithms onto the proposed medium -grained 
DN-SIMD machine. 

The proposed mapping procedure consists of three stages [2], In the first stage, each of the single steps of 
these parallel robotic algorithms is further decomposed to a set of “subtasks” and each subtask possesses the basic 
mathematical form of consisting at most two operands. On the other hand, each of the macro steps in these algo- 
rithms is viewed as a subtask and is not decomposed at this stage. The first stage results in a series of parallel sub- 
tasks. In the second stage, these subtasks are reordered to reduce the number of DTR operations through a neigh- 
borhood scheduling algorithm. The reordered subtasks will be mapped onto the DN-SIMD machine directly in the 
third stage. In the final stage, the actual implementation of the macro steps in the parallel algorithms on the DN- 
SIMD machine is performed. Using the benchmark algorithm in Table 2 as an example, the details of these three 
stages of our mapping procedure are discussed in the following subsections. 

5.1. Subtask Assignment 

Since the proposed DN-SIMD is a medium-grained machine and is synchronized at each basic mathematical 
operation, each parallel algorithm must be decomposed into a series of subtasks. Each subtask is either in the basic 
mathematical form which involves at most two operands or in a well-defined macro step. Although this functional 
decomposition can be easily performed on the single steps, it is not the case for the macro steps, in which the data 
dependencies are so complex that the decomposition based on basic computational unit is not obviously feasible. So 
the macro step will be viewed as a single subtask in this stage. Consider the decomposition of the following equa- 
tion 

K = L x (C + E) + G x C . (4) 

Here we use three temporary variables, T\, T 2 , and T 3 to rewrite Eq. (4) into four simple equations in the basic 
mathematical form: 

Ti = C + E , r 2 = LxT 1 , Tj = G x C , and K = T 2 + T^ (5) 

This same technique is applied to our decomposition process for single steps. For clarity, the benchmark 
algorithm is used as an example to demonstrate the technique. The decomposition result and the original algorithm 
are shown in Table 2. Here, two sets of variables are introduced: 7)’ s represent the immediate results (temporary 
variables) or the final outputs. If T; is a macro subtask, then it is specially denoted as f,. /,’ s represent the external 
input variables; that is, the variables that do not come from the outputs of other computations. 

To ease the subtask scheduling in the second stage, notation simplification is performed on the above task 
table to produce a simplified task table as shown in Table 3. In this table, two arrays are defined: TB[z] contains the 
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identification of subtasks T t 's and OP[i] represents the corresponding operation for subtask TB[i], Each element of 
OP[i] is either a macro subtask or in the form of A oB, where A and B may be T, (7)) or Moreover, the super- 
script on A or B indicates the difference between the index i of the result, Tj t [i], and the index k or / of its operand 
T jt [k] or T Jt lH, where Tj t [/] = T jt [k] For example, the subtask T x [i] = r 2 [i+2] oF 3 [i-l] is denoted as 

Ti=T\o Tj } . If their indices are equal, that is, i = k or j = /, then the superscript is omitted. For example, the sub- 
task r 4 [i]= r 5 W o FJi] is denoted simply as F 4 = T 5 o T 6 . The simplified task table is the final result of this stage 
and will be used as the input for the next stage. 

5.2. Subtask Scheduling 

To schedule the subtasks for computation, we first observe all the possible operand sources and their combi- 
nations for each computation. The operand may be one of the four possible types denoted as S h S 0 ,, S T , and S OT 
which correspond to four kinds of different sources. S, denotes the operand from the IDB and it needs one period of 
transmission time. S 0 , denotes the operand which is fetched by the previous computation (subtask) from the IDB 
and is still in the IWR within the PE, so no transmission is required for this operand. S T denotes the operand from 
the GDR and this operand requires one cycle of transmission time via the network RIN1. S OT denotes the operand 
from other sources including the following three possibilities: (i) The operand which is fetched by the previous 
computation from the GDR and is still in the IWR within the PE, so no transmission time is required; (ii) Current 
computation result through the inner loop; (iii) Current computation result through the internal forwarding path with 
data exchange provided by the network RIN2. The transmission time for the last two cases is ignored when com- 
pared to the system cycle time. Using these notations, all the possible combinations of operand sources including 
the situation of only one operand are listed below: 


(S ', , si) 

(S , , S T ) 

(Sqt > Sr) 

(Soi » Soi) 

(S,) 

(S, , So,) 

Soi » Sqt) 

(S' t . S" T ) 

( Sqt * Sqt) 

(So,) 

(S, , Sqt) 

( So / » S T ) 

(S,,S,) 

(St » St) 

(St) » (Sot) 


where the prime superscripts are used to distinguish different operands from the same kind of source. Among these 
situations, the combinations (S ', , S",) and (Sj , St) are DTR operations and require two cycles to transmit two 
operands through the same transmission path. It is possible to eliminate DTR operations, if we reorder the process- 
ing sequence without violating the constraint of precedence relation. That is, in these two situations, one operand S T 
(or Sj) can become the type S OT , or 5) (or S/') can become the type S 0 ,. Then, the DTR operation phenomena can be 
avoided and the unnecessary transmission can also be avoided for the efficient use of the same data repetitively and 
instantly. 

A neighborhood scheduling algorithm for scheduling and reordering the execution of these subtasks to minim- 
ize the total number of DTR operations has been developed and is considered here. 

Definition 1. For two subtasks in the Ath and /th rows of the simplified task table, TB[k] and TB[l\, assume 
OP[k]=AoB and OP [/ ] = C o D, where A, B, C, and D are operands, each with one of these possible types: 
[Ij, Tj,Tj}. Then the subtask TB[k] is called a neighborhood of TB\l\ if all the following conditions are satisfied: 

(i) k < l, 

(ii) C = TB[k] or C = TB’[k] orC = AorC = Bor 
D = TB[k]otD = TB‘[k] oxD = A orD =B . 

From the above definition, we know that if subtask TB[l\ has a previous subtask TB[k] as its neighborhood 
(k < /) and moreover, if these two subtasks are next to each other, i.e., I = k + 1, then at least one operand of subtask 
TB[F] comes directly from the result or operand of subtask FB[k] without accessing the GDR or the IDB. This obvi- 
ously will save the communication time to access global memories, and the subtask TB[l] will never be a DTR sub- 
task, thus minimizing the number of DTR subtasks. 

Definition 2. A subtask in the Ath row of the simplified task table TB[k\ is called a double transmission 
required (DTR) subtask if the following two conditions are satisfied: 

(i) Its operand is one of these types: 

OP[k] = 77? [m] o TB[n ] for some m, n < k and m*n . 

OP[k] = TB‘[m] o TB[n] for some m, n < k and m * n . 

OP[k] = TB[m] o TB‘[n] for some m, n <k and m*n . 

OP[k] = TB‘[m] o TB J [n] for some m, n < k and m *n. 

OP[k) = l[m] o I[n] for m * n . 

(ii) k = 1 or 7'B[A-1] is not a neighborhood of TB[k] for k > 1 . 
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Notice that for OP[k\ = TB[m ] o I[n\ and OP[k\ = TB‘[m] o /[«], the subtask OP[k] is not a DTR subtask because its 
two operands can be transmitted simultaneously through two different set of connection lines. Moreover, a subtask 
involves only one operand is obviously a non-DTR subtask. For example, in the simplified task table of the bench- 
mark algorithm, subtasks T 7 , T 9 , T 12 , T 13 , T 15 , and T l7 are all DTR subtasks as indicated in Table 3. 

From the above definition, whether a subtask is a DTR subtask depends on its “position” in the simplified 
task table. A DTR subtask can become a non-DTR subtask if it is moved to the place exactly behind its neighbor- 
hood. Since it is possible that the movement of a DTR subtask may introduce another new DTR subtask, this reord- 
ering process is desirable only when it complies with the precedence constraint of the original algorithm and the 
number of DTR subtasks in the reordered task table is less than that in the original table. This forms the scheduling 
problem ; that is, to reorder the processing sequence of subtasks to reduce the number of DTR subtasks as far as pos- 
sible without violating the precedence constraint of the original algorithm. This reordering process can be per- 
formed by the following efficient neighborhood scheduling algorithm. 

Algorithm N-Scheduling ( Neighborhood Scheduling Algorithm). 

Input: Simplified Task Table with n rows (i.e., n subtasks). 

Output: Reordered Task Table. 

Nl. [Main Loop] Check each subtask to see if it is a DTR subtask. If yes, try to change its position. 

For k = 1 step 1 until n do 

N2. [Check DTR] 

Check if TB[k] is a DTR subtask according to definition 2? If not, go to step N4. 

N3. [Main Body] Try to change the position of a DTR subtask to make it into a non-DTR subtask. 

If OP[k] = (7B[m] or TB a [m]) « (TB[n] or TB h [n]), 

then let i <— max (m, n); 

else let * «- 1; {* OP[k] = I[m] o I[n] * } 

End [If] 

While i < k-l do 

If 7B[r] is a neighborhood of TB[k], then 

If [7B[i+l] is a DTR subtask} or (the insertion of TB [k] between 7B[r] and 
7B[i+l] will not make 7B[i+l] a DTR subtask}, 

then insert TB[k\ behind TB[i] to make TB[k\ the new (j+l)th subtask; 
go to step N4 
End [If] 

End {If} 

Let i <— j+1; 

End {While} 

N4. Continue {main loop} 

End {For) 

END. {N-Scheduling} 

As an example, the N-Scheduling algorithm is applied to die benchmark algorithm. The input is the 
simplified task table in Table 3, which has a total of 18 subtasks and six of them are DTR subtasks. After applying 
the N-Scheduling algorithm to this simplified task table, the reordered task table is produced as shown in Table 3, in 
which all the DTR subtasks in the simplified task table have been removed. 

5.3. Mapping Procedure 

The reordered task table produced by the N-Scheduling algorithm can be mapped onto the proposed DN- 
SIMD machine in a rather straightforward way because these subtasks are all single-step, simple snhtackc If the 
subtasks are macro steps, then their mapping requires further consideration. Our mapping procedure at this final 
stage consists of two phases. In the first phase, the subtasks including single steps and macro steps which are 
viewed as single steps temporarily are mapped onto the DN-SIMD machine in a row directly. The actual mapping 
of the macro steps is considered in the second phase. The output of the mapping procedure is a control table as 
shown in Table 4. This table consists of ten columns and indicates the exact movement of the central control unit. 
The first column represents the identification of subtasks appearing in processing order. It also represents the result 
of the corresponding subtask. The second column indicates the first operand; it may be 7) ( 7} , fj , Tj ) or 7 ; for 
some j. The third column indicates the source of the first operand, and there are five possibilities: the GDR, the 
IDB, the 1FD, the IWR and the OWR within the PE. The fourth column describes which network is used (RIN1 or 
RIN2) and the required connection type on it to transmit the first operand if necessary. Columns 5 to 7 contain the 
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same information as the previous three columns, but for the second operand if it exists. Column 8 indicates the 
operation performed in this subtask. Column 9 indicates the destination of the result; it may be the GDR, the IFD, 
or both. If the IFD is needed, the connection type of network RIN2 is specified. Column 10 contains some com- 
ment on this subtask. For a macro subtask, these columns possess somewhat different meanings. Columns 2-7 indi- 
cate the corresponding information for the initial conditions of the macro subtask (similar to the parameters for a 
subroutine in a serial program). Columns 9-10 indicate the corresponding information for the final result of the 
macro subtask (similar to the return values of a subroutine in a serial program). 

At the end of phase 1 of the mapping procedure, the control table of the benchmark algorithm is obtained as 
shown in Table 4. Since there are three macro subtasks in the control table, further mapping must be performed in 
phase 2. Among these macro subtasks, T j and T & are the HLR equations, and T\\ is the HHLR equation. The map- 
ping of HLR equations are demonstrated next. 

The first-order homogeneous linear recurrence equation is defined as: Given x(0) = a(0)= null, and 
a (i), 1 < i <n, find all the x(i) for 1 < i <n from the following recursive equation 

x(i) = x(i-l)oa(0. ( 6 ) 

An efficient technique called the recursive doubling technique has been found to solve this recursive equation 
efficiently on an SIMD machine [7,19]. Using this technique, the parallel algorithm to solve Eq. (6) and the map- 
ping diagram of this algorithm onto the proposed DN-SIMD machine are shown in Fig. 3. This diagram possesses 
the same information as a control table including the sources of operands, destination of result, network used and 
required connection types for each iteration. It takes an order of 0({ log 2 (n +1)] ) iterations to produce the final 
results. Also notice that, in Fig. 3, we assume that the initial conditions a(i)’s come from the IDB. In fact, they 
may also come from the GDR depending on whether a (i)’s are external input variables or not. In that case, its map- 
ping diagram is exactly the same except that the a(i)’s are from the GDR through the network RIN1 at the begin- 
ning. Similarly, the final results x (i’)’s can be stored in the GDR or directly fedback to PEs depending on the neces- 
sity of the next subtask. Using the similar techniques, the mapping of HHLR equations can also be performed [2], 

6. Conclusions 

To design a global architecture for a set of parallel robotics algorithms, the characteristics of these algorithms 
are identified according to six fundamental features: degree of parallelism, uniformity of operations, fundamental 
operations, data dependency, and communication requirements. Considering the characteristics matching between 
the common features of the robotics algorithms and the architecture features, a medium-grained, DN-SIMD 
marhinp i s designed. It consists of two sets of reconfigurable interconnection networks. One provides the commun- 
ication between the PEs and the GDRs. The other provides the internal direct feedback paths among PEs to avoid 
unnecessary data storing and routing time. This machine performs three-stage pipelined operations and is synchron- 
ized at each basic mathematical calculation. 

With the parallel robotics algorithms and the proposed DN-SIMD parallel machine, a systematic mapping 
procedure to schedule the subtasks of the parallel algorithms onto the parallel architecture is developed. This map- 
ping procedure consists of three stages. At the first stage, mathematical decomposition is performed on the parallel 
algorithms to achieve a series of subtasks and each subtask is either in the basic mathematical form which involves 
at most two operands, or a well-structured macro subtask such as the linear recurrence equations. At the second 
stage, to shorten the communication time, the processing sequence of subtasks is reordered to minimize the total 
number of DTR subtasks using the Neighborhood Scheduling algorithm. At the final stage, the reordered subtasks 
are mapped onto the DN-SIMD machine. In this process, the single-step subtasks can be mapped directly, while the 
macro-step subtasks need further design and special technique such as the recursive doubling technique for solving 
the linear recurrence equations. A benchmark algorithm was used throughout as an example to illustrate the map- 
ping procedure. 
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Table 1 . Characteristics of Basic Robotics Algorithms. 


Algorithms 

CHARACTERISTICS 

Type of 
Parallelism 

Degree of 
Parallelism 

Uniformity of 
Operations 

Fundamental 

Operations 

Data 

Dependency 


Inverse 

Dynamics 

Job level 

Large grain 

Yes 

Matrix-Vector 

HLR 

(Regular) 

Permutation 

Forward 

Dynamics 

Job level 

Large grain 

Yes 

Scalar ops. 
Reciprocal 
Matrix-Vector 

HLR.HHLR 
SHLR, PNE 
System of Linear Eqs. 

one-to-one 

Permutation 

Broadcast 

Forward 

Kinematics 

Job level 

Medium or 
Fine grain 

Yes 

Matrix Mult. 
Trigonometric 

HLR 

(Regular) 

Permutation 

Forward 

Jacobian 

Job level 

Medium or 
Fine grain 

Yes 

Matrix-Vector 

HLR (Forward 
& Backward) 

(Irregular) 

Permutation 

Broadcast 

Inverse 

Jacobian 

(Direct) 

Job level 

Medium or 
Fine grain 

Yes 

Scalar ops. 
Reciprocal 
Vector ops. 

Global 

Permutation 

Broadcast 

Inverse 

Jacobian 

(iterative) 

Job level 

Medium or 
Fine grain 

Yes 

Scalar ops. 
Reciprocal 
Vector ops. 

Local 

Permutation 

Broadcast 

Inverse 

Kinematics 

(Direct) 

Task level 

Fine grain 

No 

Scalar ops. 
Reciprocal 
Square root 
Trigonometric 

Global 

one-to-one 

Broadcast 

Inverse 

Kinematics 

(iterative) 

Job level 

Medium or 
Fine grain 

Yes 

Scalar ops. 
Reciprocal 
Matrix-Vector 
Trigonometric 

Local 

one-to-one 

Permutation 

Broadcast 





























































Table 2. Robotics Benchmark Algorithm and Subtask Assignment. 


G, 

Equations * 

Ti 

Subtasks 

G i 

A [1 ] =A[i-l] x B [i ] , 2 < i <, n, A [ 1] = B [1] 

a 

f 1 [ t ] = f 1 [i-l]x/ 1 [i], 2 <i<n, f,[l] = /i[l] 

g 2 

C[i] = A[i]-D[i] 

T 2 

r 2 [i] = f 1 [<l*/ 2 [i] 

g 3 

E[i] = A[i]-F[i] 

T 3 

r 3 [i] = f 1 [/]-/ 3 [ l ] 

G 4 

G[i] = G[i-\] + (E[i]H [i ] V [i ], 2 < i < n 
G[1] = (£[1]//[1]V[1] 

r 4 

r 4 [i]=r 3 [j]/ 4 [t] 

T 5 

T 5 [i) = T A [i]I 5 [i] 

T 6 

T 6 [i ] = f 6 [i- 1] + T s [i ], 2 <i<n, f 6 [ 1] = r 5 [l] 

g 5 

K[i] = L[i](C[i]+E[i]) + G[i]+C[i] 
M[i] = N [j ]M [i -1 ] + K [i ],2<i<,n 
M[l] = K[l] 

T-, 

T 1 [i) = T 2 [i] + T 3 [i] 

T g 

Tg[i) = T 1 [i]l 6 [i] 

t 9 

T 9 [i] = f 6 [i] + T 2 [i) 

BBl 

T l0 U] = Ti[i] + T 9 [i] 

EH 

T 11 [»]=/ 7 [i]Tii[i-l] + r 1 o[i], \<i <n, f„[n] = 7’ 10 [n] 

g 6 

0[i] = (E [i -1] + Af [/ ]) + (G[i]+K [i +2]) 
P[i]=A[i)>G [*] 

Q [J] = M [j ] *C [i+ 1 ] + 0 [i +1 ] •/> [i ] 

Tn 

T l2 [i] = T 6 [i] + T l0 [i+2] 


7’i 3 [i] = r 3 [i-l] + f„[i] 

am 

T 14 [i] = 7’i 2 [i] + T i 3 [/] 

T 15 

Ti S [i) = fdi]-T 6 [i] 

Tie 

T i 6 [i] = T i 4 [i+l]*7’ is[* ] 

Tn 

7 , i7[«] = r 2 [i+l]-f 11 [i] 


Ti 8 [i]=r 16 [i] + r 17 [i] 


where in this table, except separate indication, i is from 1 to n. 

B [i ], D [i ], F[/], H [i ], J [i ], L [t ], N [i ]; i = l,...,n; arc assumed to be input variables. 
B [i ] is a 3x3 matrix. D[i],F [/ ] are 3x1 vectors. 

H[i],J [i ], L [i ], N [i ] arc all scalars. 


































Table 3. Simplified and Reordered Task Table of Benchmark Algorithm. 



SIMPLIFIED TASK TABLE 

REORDERED TASK TABLE 

ROW 

TB[ROW] 

OP[ROW] 

DTR 

TB[ROW] 

OP[ROW] 

DTR 

1 

mm 

f'xlx 


T\ 

TT’x/i 


2 

T 2 

T 1^2 


t 2 



3 

T 3 

f 1^3 


t 3 



4 

t a 

T 3 / 4 


7* 7 

t 2 +t 3 


5 

T 5 

7* 4/5 


7* 4 

7 - 3/4 


6 

T 6 

Te+Ts 


7* 5 

TVs 


7 

7*7 


X 

T 6 

7? +7-5 


8 

T s 

Tih 


WEM 

7*i * 7-5 


9 

t 9 


X 

T 9 

^6+7*2 


10 

7*,o 

T’g+T’j 


7*g 

7 * 7/6 


11 

Tn 

/ 7 r u +rio 


T 10 

T s +T 9 


12 

7*1 2 

f«+no 

■a 

7-12 

76+7*10 


13 

■■ 

i?+r„ 

X 

fi. 



14 

7*14 

^12+^13 


7 ", 7 

7? *7*11 


15 

mm 

fpf* 

B 

■B 

T?+f n 


16 

mam 



7-14 

^12+^13 


17 

^17 

n'-fn 

X 

■B 

774 * 7*15 


18 

7*,g 

T 16+^ 17 


7*1* 

7*16+7*17 



where T x and T 6 are HLR equations and 7* n is an HHLR equation. 
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Table 4. Control Table for Benchmark Algorithm. 


1 

2 

3 

4 

5 

6 

7 

8 

9 

10 

Ti 

Operand 

Source 

Network 

Operand 

Source 

Network 

Operation 

Output Destination 

Comment 


1 

1 

Type 

2 

2 

Type 


GDR 

IFD 

RIN2 

B 

T i 

IFD 

♦ 

/ 1 

IDB 

- 

MM 

X 


- 

* HLR Eqn. 

t 2 

Ti 

OWR 

- 

h 

IDB 

- 

MV 

X 


- 


t 3 

T\ 

IWR 

- 

h 

IDB 

- 

MV 

X 


- 


Ea 


OWR 

- 

T 2 

GDR 

RIN1-1 

VA 

X 


- 


EBI 

t 3 

IWR 

- 

U 

IDB 

- 

SV 



- 


T S 

T a 

OWR 

- 

h 

IDB 

- 

sv 



- 


T 6 

Te 

IFD 

* 

T 5 

OWR 

- 

VA 

X 


- 

* HLR Eqn. 

T 15 

T 6 

OWR 

- 

T i 

GDR 

RIN1-1 

MV 

X 


- 


t 9 

T 6 

IWR 

- 

t 2 

GDR 

RIN1-1 

VA 

X 


- 


Ti 

Ti 

GDR 

RIN1-1 

h 

IDB 

- 

SV 



- 



Ti 

OWR 

- 

T 9 

GDR 

RIN1-1 

VA 

X 

X 

2 


Tn 

t2 
* 10 

IFD 

RIN2-2 

t 6 

GDR 

RIN1-1 

VA 

X 


- 


f ii 

h 

IDB 

- 

♦ 

* 

* 

* 



- 

* HHLR Eqn. 

Tl7 

T u 

OWR 

- 

T? 

GDR 

RIN1-3 

VI 

X 


- 


T,3 

Tn 

IWR 

- 

T? 

GDR 

RIN1-4 

VA 



- 


T 14 

Tn 

OWR 

- 

T 12 

GDR 

RIN1-1 

VA 


X 

3 


^16 

n a 

IFD 

RIN2-3 

T 15 

GDR 

RIN1-1 

VI 



- 


Tig 

T 16 

OWR 

- 

Tn 

GDR 

RIN1-1 

SA 

X 


- 

Result 


Connection type 1 : straight connection; Connection type 3: uniform module shift (d = 1) 

Connection type 2: uniform module shift ( d = 2); Connection type 4: uniform module shift (d = -1) 























































ALU : Arithmetic Logic Unit 
IR: Input Working Register 
OR : Output Working Register 


Figure 1 . (a) Structure of Dual Network SIMD Machine, 
(b) The Structure of Processing Element. 
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(a) Input Operands Flow Diagrams 



(b) Output Result Flow Diagrams 
Figure 2. Data Path Flow of Processing Element 
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Algorithm FOHRA ( First-Order Homogeneous Recurrence Algorithm). 

FI. [Initialization] Given the terms Oi,0<.i<n, let X w (i) be the ith sequence at the *th splitting 
and j =[ log 2 (n+l)]. Set the sequence at the initial step, X (a> (i) <- a,, o<i<n. 

F2. [Compute jc, parallelly] 
for k. <r- l to s, do 

X (k ~ l) (i - 2*-' ) * X (k ~ l) (i) , if 2 k ~ l Zi<n 

X (k) (i)=- 

X (t-1) (/) , if 0 < / < 2* _I 

end [for] 

Set Xi <— X is) (i), 1 < / < /i. 

END FOHRA. 



RIN2 RIN2 RIN2 


Figure 3. Mapping Diagram of First-Order Homogeneous Linear Recurrence Equations on 
DN-SIMD Machine. 
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